Viral material and nucleotide fragments associated with multiple sclerosis, for diagnostic, prophylactic and therapeutic purposes

ABSTRACT

The invention provides viral material and nucleotide fragments associated with multiple sclerosis and/or rheumatoid arthritis for use in methods of diagnosis, prophylaxis, and therapy.

[0001] This application is a continuation-in-part of U.S. application Ser. No. 08/756,429, filed Nov. 26, 1996.

[0002] Multiple sclerosis (MS) is a demyelinating disease of the central nervous system (CNS), the cause of which remains as yet unknown.

[0003] Multiple sclerosis (MS) is the most common neurological disease of young adults, with a prevalence in Europe and North America of between 20 and 200 per 100,000. It is characterized clinically by a relapsing/remitting or chronic progressive course, frequently leading to severe disability. Current knowledge suggests that MS is associated with autoimmunity, that genetic background has an important influence and that “infectious” agent(s) may be involved. Indeed, many viruses have been proposed as possible candidates but, as yet, none of them has been shown to play an aetiological role.

[0004] Many studies have supported the hypothesis of a viral aetiology of the disease, but none of the known viruses tested has proved to be the causal agent sought: a review of the viruses sought for several years in MS has been compiled by E. Norrby (1) and R. T. Johnson (2).

[0005] The discovery of pathogenic retroviruses in man (HTLVs and HIVs) was followed by great interest in their ability to impair the immune system and to provoke central nervous system inflammation and/or degeneration. In the case of HTLV-1, its association with a chronic inflammatory demyelinating disease in man (48) led to extensive investigations to search for an HTLV-1-like retrovirus in MS patients. However, despite initial claims, the presence of HTLV-1 or HTLV-like retroviruses was not confirmed.

[0006] Recently, a retrovirus different from the known human retroviruses has been isolated in patients suffering from MS (3, 4, and 5).

[0007] In 1989, the authors described the production of extracellular virions, associated with reverse transcriptase (RT) activity, by a culture of leptomeningeal cells (LM7) obtained from the cerebrospinal fluid of a patient with MS (3). This was followed by similar findings in monocyte cultures from a series of MS patients (5). Neither viral particles nor viral RT activity were found in control individuals. Furthermore, the authors were able to transfer the LM7 virus to non-infected leptomeningeal cells in vitro (26). The molecular characterization of the “LM7” retrovirus was a prerequisite for further evaluation of its possible role in MS. Considerable difficulties arose from the absence of continuously productive retroviral cultures and from the low levels of expression in the few transient cultures. The strategy described here focused on RNA from extracellular virions, in order to avoid non-specific detection of cellular RNA and of endogenous elements from contaminating human DNA. A specific retroviral sequence associated with virions produced by cell cultures from several MS patients has been identified. The entire sequence of this novel retroviral genome is currently being obtained using RT-PCR on RNA from extracellular virions. The retrovirus previously called “LM7 virus” corresponds to an oncovirus and is now designated MSRV (Multiple Sclerosis-associated RetroVirus).

[0008] The authors were also able to show that this retrovirus could be transmitted in vitro, that patients suffering from MS produced antibodies capable of recognizing proteins associated with the infection of leptomeningeal cells by this retrovirus, and that the expression of the latter could be strongly stimulated by the immediate-early genes of some herpesviruses (6).

[0009] All these results point to the role in MS of at least one unknown retrovirus or of a virus having reverse transcriptase activity which is detectable according to the method published by H. Perron (3) and qualified as “LM7-like RT” activity. The content of the publication identified by (3) is incorporated in the present description by reference.

[0010] Recently, the Applicant's studies have enabled two continuous cell lines infected with natural isolates originating from two different patients suffering from MS to be obtained by a culture method as described in the document WO-A-93/20188, the content of which is incorporated in the present description by reference. These two lines, derived from human choroid plexus cells, designated LM7PC and PLI-2, were deposited with the ECACC on Jul. 22, 1992 and Jan. 8, 1993, respectively, under numbers 92072201 and 93010817, in accordance with the provisions of the Budapest Treaty. Moreover, the viral isolates possessing LM7-like RT activity were also deposited with the ECACC under the overall designation of “strains”. The “strain” or isolate harboured by the PLI-2 line, designated POL-2, was deposited with the ECACC on Jul. 22, 1992 under No. V92072202. The “strain” or isolate harboured by the LM7PC line, designated MS7PG, was deposited with the ECACC on Jan. 8, 1993 under No. V93010816.

[0011] Starting from the cultures and isolates mentioned above, characterized by biological and morphological criteria, the next step was to endeavour to characterize the nucleic acid material associated with the viral particles produced in these cultures.

[0012] The portions of the genome which have already been characterized have been used to develop tests for molecular detection of the viral genome and immunoserological tests, using the amino acid sequences encoded by the nucleotide sequences of the viral genome, in order to detect the immune response directed against epitopes associated with the infection and/or viral expression.

[0013] These tools have already enabled an association to be confirmed between MS and the expression of the sequences identified in the patents cited later. However, the viral system discovered by the Applicant is related to a complex retroviral system. In effect, the sequences to be found encapsidated in the extracellular viral particles produced by the different cultures of cells of patients suffering from MS show clearly that there is coencapsidation of retroviral genomes which are related but different from the “wild-type” retroviral genome which produces the infective viral particles. This phenomenon has been observed between replicative retroviruses and endogenous retroviruses belonging to the same family, or even heterologous retroviruses. The notion of endogenous retroviruses is very important in the context of our discovery since, in the case of MSRV-1, it has been observed that endogenous retroviral sequences comprising sequences homologous to the MSRV-1 genome exist in normal human DNA. The existence of endogenous retroviral elements (ERV) related to MSRV-1 by all or part of their genome explains the fact that the expression of the MSRV-1 retrovirus in human cells is able to interact with closely related endogenous sequences. These interactions are to be found in the case of pathogenic and/or infectious endogenous retroviruses (for example some ecotropic strains of the murine leukaemia virus), and in the case of exogenous retroviruses whose nucleotide sequence may be found partially or wholly, in the form of ERVs, in the host animal's genome (e.g. mouse exogenous mammary tumor virus transmitted via the milk). These interactions consist mainly of (i) a trans-activation or coactivation of ERVs by the replicative retrovirus, (ii) an “illegitimate” encapsidation of RNAs related to ERVs, or of ERVs (or even of cellular RNAs) simply possessing compatible encapsidation sequences, in the retroviral particles produced by the expression of the replicative strain, which are sometimes transmissible and sometimes with a pathogenicity of their own, and (iii) more or less substantial recombinations between the coencapsidated genomes, in particular in the phases of reverse transcription, which lead to the formation of hybrid genomes, which are sometimes transmissible and sometimes with a pathogenicity of their own.

[0014] Thus, (i) different sequences related to MSRV-1 have been found in the purified viral particles; (ii) molecular analysis of the different regions of the MSRV-1 retroviral genome should be carried out by systematically analyzing the coencapsidated, interfering and/or recombined sequences which are generated by the infection and/or expression of MSRV-1; furthermore, some clones may have defective sequence portions produced by the retroviral replication and template errors and/or errors of transcription of the reverse transcriptase; (iii) the families of sequences related to the same retroviral genomic region provide the means for an overall diagnostic detection which may be optimized by the identification of invariable regions among the clones expressed, and by the identification of reading frames responsible for the production of antigenic and/or pathogenic polypeptides which may be produced only by a portion, or even by just one, of the clones expressed, and, under these conditions, the systematic analysis of the clones expressed in the region of a given gene enables the frequency of variation and/or of recombination of the MSRV-1 genome in this region to be evaluated and the optimal sequences for the applications, in particular diagnostic applications, to be defined; (iv) the pathology caused by a retrovirus such as MSRV-1 may be a direct effect of its expression and of the proteins or peptides produced as a result thereof, but also an effect of the activation, the encapsidation or the recombination of related or heterologous genomes and of the proteins or peptides produced as a result thereof; thus, these genomes associated with the expression of and/or infection by MSRV-1 are an integral part of the potential pathogenicity of this virus, and hence constitute means of diagnostic detection and special therapeutic targets. Similarly, any agent associated with or cofactor of these interactions responsible for the pathogenesis in question, such as MSRV-2 or the gliotoxic factor which are described in the patent application published under No. FR-2,716,198, may participate in the development of an overall and very effective strategy for the diagnosis, prognosis, therapeutic monitoring and/or integrated therapy of MS in particular, but also of any other disease associated with the same agents.

[0015] In this context, a parallel discovery has been made in another autoimmune disease, rheumatoid arthritis (RA), which has been described in the French Patent Application filed under No. 95/02960. This discovery shows that, by applying methodological approaches similar to the ones which were used in the Applicant's work on MS, it was possible to identify a retrovirus expressed in RA which shares the sequences described for MSRV-1 in MS, and also the coexistence of an associated MSRV-2 sequence also described in MS. As regards MSRV-1, the sequences detected in common in MS and RA relate to the pol and gag genes. In the current state of knowledge, it is possible to associate the gag and pol sequences described with the MSRV-1 strains expressed in these two diseases.

[0016] The present patent application relates to various results which are additional to those already protected by the following French Patent Applications:

[0017] No. 92/04322 of Mar. 4, 1992, published under U.S. Pat. No. 2,689,519;

[0018] No. 92/13447 of Mar. 11, 1992, published under U.S. Pat. No. 2,689,521;

[0019] No. 92/13443 of Mar. 11, 1992, published under U.S. Pat. No. 2,689,520;

[0020] No. 94/01529 of Apr. 2, 1994, published under U.S. Pat. No. 2,715,936;

[0021] No. 94/01531 of Apr. 2, 1994, published under U.S. Pat. No. 2,715,939;

[0022] No. 94/01530 of Apr. 2, 1994, published under U.S. Pat. No. 2,715,936;

[0023] No. 94/01532 of Apr. 2, 1994, published under U.S. Pat. No. 2,715,937;

[0024] No. 94/14322 of Nov. 24, 1994, published under U.S. Pat. No. 2,727,428;

[0025] and No. 94/15810 of Dec. 23, 1994, published under U.S. Pat. No. 2,728,585.

SUMMARY OF THE INVENTION

[0026] The present invention relates, in the first place, to a viral material, in the isolated or purified state, which may be recognized or characterized in different ways:

[0027] its genome comprises a nucleotide sequence chosen from the group including the sequences SEQ ID NO: 42, SEQ ID NO: 47, SEQ ID NO: 48, SEQ ID NO: 49, SEQ ID NO: 52, SEQ ID NO: 54, SEQ ID NO: 55, SEQ ID NO: 56, SEQ ID NO: 57, SEQ ID NO: 83, their complementary sequences and their equivalent sequences, in particular nucleotide sequences displaying, for any succession of 100 contiguous monomers, at least 50% and preferably at least 70% homology with the said sequences SEQ ID NO: 42, SEQ ID NO: 47, SEQ ID NO: 48, SEQ ID NO: 49, SEQ ID NO: 52, SEQ ID NO: 54, SEQ ID NO: 55, SEQ ID NO: 56, SEQ ID NO: 57, SEQ ID NO: 83, respectively, and their complementary sequences;

[0028] the region of its genome comprising the env and pol genes and a portion of the gag gene, excluding the subregion having a sequence identical or equivalent to SEQ ID NO: 1, codes for any polypeptide displaying, for any contiguous succession of at least 30 amino acids, at least 50% and preferably at least 70% homology with a peptide sequence encoded by any nucleotide sequence chosen from the group including SEQ ID NO: 42, SEQ ID NO: 47, SEQ ID NO: 48, SEQ ID NO: 49, SEQ ID NO: 52, SEQ ID NO: 54, SEQ ID NO: 55, SEQ ID NO: 56, SEQ ID NO: 57, SEQ ID NO: 83 and their complementary sequences;

[0029] the pol gene comprises a nucleotide sequence partially or totally identical or equivalent to SEQ ID NO: 53 or SEQ ID NO: 87, excluding SEQ ID NO: 1;

[0030] the gag gene comprises a nucleotide sequence partially or totally identical or equivalent to SEQ ID NO: 82.

[0031] As indicated above, according to the present invention, the viral material as defined above is associated with MS. Moreover, as defined by reference to the pol or gag gene of MSRV-1, and more especially to the sequences SEQ ID NOS 47, 52, 53, 55, 56, 57, 82, 83, 87, 128, 129, 130, 131, 135, 136, 137 and 138, this viral material is associated with RA.
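The equivalence criterion used above (a minimum percentage of homology over any succession of 100 contiguous monomers) can be illustrated with a minimal sketch. It rests on assumptions that are not taken from the present description: "homology" is read here as simple percent identity, the two sequences are assumed to be already aligned position by position without gaps, and the function names and toy sequences are purely hypothetical.

```python
def percent_identity(a: str, b: str) -> float:
    """Percentage of positions at which two equal-length sequences match."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(1 for x, y in zip(a.upper(), b.upper()) if x == y)
    return 100.0 * matches / len(a)


def min_window_identity(candidate: str, reference: str, window: int = 100) -> float:
    """Lowest percent identity found over every window of `window` monomers.

    Assumption: both sequences are pre-aligned and ungapped; the description
    does not prescribe an alignment method.
    """
    if len(candidate) != len(reference):
        raise ValueError("sequences must have equal length")
    if len(candidate) < window:
        raise ValueError("sequences are shorter than the window")
    return min(
        percent_identity(candidate[i:i + window], reference[i:i + window])
        for i in range(len(candidate) - window + 1)
    )


def is_equivalent(candidate: str, reference: str,
                  window: int = 100, threshold: float = 50.0) -> bool:
    """True if every window reaches the threshold (e.g. 50.0 or 70.0)."""
    return min_window_identity(candidate, reference, window) >= threshold


if __name__ == "__main__":
    # Toy data only: not a sequence from the sequence listing.
    reference = "ACGT" * 30
    candidate = "TTTT" * 10 + reference[40:]
    print(min_window_identity(candidate, reference))
    print(is_equivalent(candidate, reference, threshold=70.0))
```

Taking the minimum over all windows reflects the wording "for any succession", which requires every window, not merely the average, to reach the stated percentage.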

[0032] The present invention also relates to a nucleic material, in the isolated or purified state, having at least one of the following definitions:

[0033] a nucleic material comprising a nucleotide sequence selected from the group including sequences SEQ ID NO: 87, SEQ ID NO: 88, SEQ ID NO: 128, SEQ ID NO: 129, SEQ ID NO: 130, SEQ ID NO: 131, SEQ ID NO: 135, SEQ ID NO: 136, SEQ ID NO: 137, SEQ ID NO: 138, their complementary sequences and their equivalent sequences, in particular nucleotide sequences displaying, for any succession of 100 contiguous monomers, at least 50% and preferably at least 60% homology with said sequences SEQ ID NO: 87, SEQ ID NO: 88, SEQ ID NO: 128, SEQ ID NO: 129, SEQ ID NO: 130, SEQ ID NO: 131, SEQ ID NO: 135, SEQ ID NO: 136, SEQ ID NO: 137, SEQ ID NO: 138, and their complementary sequences, excluding HSERV-9 (or ERV-9); advantageously, the nucleotide sequence of said nucleic material is selected from the group including sequences SEQ ID NO: 87, SEQ ID NO: 88, SEQ ID NO: 128, SEQ ID NO: 129, SEQ ID NO: 130, SEQ ID NO: 131, SEQ ID NO: 135, SEQ ID NO: 136, SEQ ID NO: 137, SEQ ID NO: 138, their complementary sequences and their equivalent sequences, in particular nucleotide sequences displaying, for any succession of 100 contiguous monomers, at least 70% and preferably at least 80% homology with said sequences SEQ ID NO: 87, SEQ ID NO: 88, SEQ ID NO: 128, SEQ ID NO: 129, SEQ ID NO: 130, SEQ ID NO: 131, SEQ ID NO: 135, SEQ ID NO: 136, SEQ ID NO: 137, SEQ ID NO: 138, and their complementary sequences;

[0034] a nucleic material, in the isolated or purified state, coding for any polypeptide displaying, for any contiguous succession of at least 30 amino acids, at least 50%, preferably at least 60%, and most preferably at least 70% homology with a peptide sequence encoded by any nucleotide sequence selected from the group including SEQ ID NO: 87, SEQ ID NO: 88, SEQ ID NO: 128, SEQ ID NO: 129, SEQ ID NO: 130, SEQ ID NO: 131, SEQ ID NO: 135, SEQ ID NO: 136, SEQ ID NO: 137, SEQ ID NO: 138 and their complementary sequences;

[0035] a nucleic material, in the isolated or purified state, of retroviral type, comprising a nucleotide sequence identical or similar to at least part of the pol gene of an isolated retrovirus associated with multiple sclerosis or rheumatoid arthritis; advantageously, said nucleotide sequence is 80% similar to said at least part of the pol gene;

[0036] a nucleic material comprising a nucleotide sequence identical or similar to at least part of the pol gene of an isolated virus encoding a reverse transcriptase having an enzymatic site comprised between the amino acid domains LPQG-YXDD, having a phylogenic distance with HSERV-9 of 0.063±0.1, and preferably 0.063±0.05; the phylogenic distances are calculated on the basis of a reference sequence according to UPGM tree option of the Genework™ Software (INTELLIGENETICS). By enzymatic site, we understand the amino acid domain(s) conferring the specific activity of a given enzyme.

[0037] The present invention also relates to different nucleotide fragments each comprising a nucleotide sequence chosen from the group including:

[0038] (a) all the genomic sequences, partial and total, of the pol gene of the MSRV-1 virus, except for the total sequence of the nucleotide fragment defined by SEQ ID NO: 1;

[0039] (b) all the genomic sequences, partial and total, of the env gene of MSRV-1;

[0040] (c) all the partial genomic sequences of the gag gene of MSRV-1;

[0041] (d) all the genomic sequences overlapping the pol gene and the env gene of the MSRV-1 virus, and overlapping the pol gene and the gag gene;

[0042] (e) all the sequences, partial and total, of a clone chosen from the group including the clones FBd3 (SEQ ID NO: 42), t pol (SEQ ID NO: 47), JLBc1 (SEQ ID NO: 48), JLBc2 (SEQ ID NO: 49) and GM3 (SEQ ID NO: 52), FBd13 (SEQ ID NO: 54), LB19 (SEQ ID NO: 55), LTRGAG12 (SEQ ID NO: 56), FP6 (SEQ ID NO: 57), G+E+A (SEQ ID NO: 83), excluding any nucleotide sequence identical to or lying within the sequence defined by SEQ ID NO: 1;

[0043] (f) sequences complementary to the said genomic sequences;

[0044] (g) sequences equivalent to the said sequences (a) to (e), in particular nucleotide sequences displaying, for any succession of 100 contiguous monomers, at least 50% and preferably at least 70% homology with the said sequences (a) to (d), provided that this nucleotide fragment does not comprise or consist of the sequence ERV-9 as described in LA MANTIA et al. (18).

[0045] The term genomic sequences, partial or total, includes all sequences associated by coencapsidation or by coexpression, or recombined sequences.
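The "complementary sequences" referred to in item (f) and elsewhere can be derived mechanically from a given strand. The following is a minimal sketch only, assuming a DNA alphabet (A, C, G, T, plus N for an undetermined base) and assuming that the complementary strand is written in the 5′-3′ direction, i.e. as a reverse complement; the function name is hypothetical and the example strings are toy data, not sequences from the sequence listing.

```python
# Watson-Crick pairing for DNA; N (undetermined base) pairs with N.
DNA_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}


def reverse_complement(seq: str) -> str:
    """Return the complementary strand of a DNA sequence, written 5'-3'."""
    try:
        return "".join(DNA_COMPLEMENT[base] for base in reversed(seq.upper()))
    except KeyError as exc:
        raise ValueError(f"unexpected base: {exc.args[0]!r}") from None


if __name__ == "__main__":
    print(reverse_complement("ATGC"))  # GCAT
```

Applying the function twice returns the original sequence, which is a convenient sanity check when both strands of a fragment are handled.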

[0046] Preferably, such a fragment comprises:

[0047] either a nucleotide sequence identical to a partial or total genomic sequence of the pol gene of the MSRV-1 virus, except for the total sequence of the nucleotide fragment defined by SEQ ID NO: 1, or identical to any sequence equivalent to the said partial or total genomic sequence, in particular one which is homologous to the latter;

[0048] or a nucleotide sequence identical to a partial or total genomic sequence of the env gene of the MSRV-1 virus, or identical to any sequence complementary to the said nucleotide sequence, or identical to any sequence equivalent to the said nucleotide sequence, in particular one which is homologous to the latter.

[0049] In particular, the invention relates to a nucleotide fragment comprising a coding nucleotide sequence which is partially or totally identical to a nucleotide sequence chosen from the group including:

[0050] the nucleotide sequence defined by SEQ ID NO: 36, SEQ ID NO: 58 or SEQ ID NO: 83;

[0051] sequences complementary to SEQ ID NO: 36, SEQ ID NO: 58 or SEQ ID NO: 83;

[0052] sequences equivalent, and in particular homologous to SEQ ID NO: 36, SEQ ID NO: 58 or SEQ ID NO: 83;

[0053] sequences coding for all or part of the peptide sequence defined by SEQ ID NO: 35, SEQ ID NO: 59 or SEQ ID NO: 84;

[0054] sequences coding for all or part of a peptide sequence equivalent, in particular homologous to SEQ ID NO: 35, SEQ ID NO: 59 or SEQ ID NO: 84, which is capable of being recognized by sera of patients infected with the MSRV-1 virus, or in whom the MSRV-1 virus has been reactivated.

[0055] The invention also relates to a nucleotide fragment (called fragment I) having at least one of the following definitions:

[0056] a nucleotide fragment comprising a nucleotide sequence selected from the group including SEQ ID NO: 87, SEQ ID NO: 88, SEQ ID NO: 128, SEQ ID NO: 129, SEQ ID NO: 130, SEQ ID NO: 131, SEQ ID NO: 135, SEQ ID NO: 136, SEQ ID NO: 137, SEQ ID NO: 138, their complementary sequences, and their equivalent sequences, in particular nucleotide sequences displaying, for any succession of 100 contiguous monomers, at least 50% and preferably at least 60% homology with said sequences and their complementary sequences, said group excluding SEQ ID NO: 1, said nucleotide fragment not comprising nor consisting of the sequence HSERV-9 (or ERV-9); preferably the nucleotide sequence of said fragment is selected from the group including SEQ ID NO: 87, SEQ ID NO: 88, SEQ ID NO: 128, SEQ ID NO: 129, SEQ ID NO: 130, SEQ ID NO: 131, SEQ ID NO: 135, SEQ ID NO: 136, SEQ ID NO: 137, SEQ ID NO: 138, their complementary sequences, and their equivalent sequences, in particular nucleotide sequences displaying, for any succession of 100 contiguous monomers, at least 70% and preferably at least 80% homology with said sequences and their complementary sequences;

[0057] a nucleotide fragment comprising a coding nucleotide sequence which is partially or totally identical to a nucleotide sequence selected from the group including:

[0058] SEQ ID NO: 87, SEQ ID NO: 88, SEQ ID NO: 128, SEQ ID NO: 129, SEQ ID NO: 130, SEQ ID NO: 131, SEQ ID NO: 135, SEQ ID NO: 136, SEQ ID NO: 137, SEQ ID NO: 138; their complementary sequences; their equivalent sequences, in particular homologous to SEQ ID NO: 87, SEQ ID NO: 88, SEQ ID NO: 128, SEQ ID NO: 129, SEQ ID NO: 130, SEQ ID NO: 131, SEQ ID NO: 135, SEQ ID NO: 136, SEQ ID NO: 137, SEQ ID NO: 138;

[0059] sequences encoding all or parts of the peptide sequence defined by SEQ ID NO: 89, SEQ ID NO: 132, SEQ ID NO: 133, SEQ ID NO: 134, SEQ ID NO: 139, SEQ ID NO: 140, SEQ ID NO: 141;

[0060] sequences encoding all or parts of a peptide sequence equivalent, in particular homologous to SEQ ID NO: 89, SEQ ID NO: 132, SEQ ID NO: 133, SEQ ID NO: 134, SEQ ID NO: 139, SEQ ID NO: 140, SEQ ID NO: 141, which is capable of being recognized by sera of patients infected with the MSRV-1 virus, or in whom the MSRV-1 virus has been reactivated.

[0061] The invention also relates to any nucleic acid probe for the detection of virus associated with MS and/or rheumatoid arthritis (RA), which is capable of hybridizing specifically with any fragment such as is defined above, belonging or lying within the genome of the said pathogenic agent. It relates, in addition, to any nucleic acid probe for detection of a pathogenic and/or infective agent associated with RA, which is capable of hybridizing specifically with any fragment as defined above by reference to the pol and gag genes, and especially with respect to the sequences SEQ ID NOS 36, 47, 52, 55, 56, 57, 58, 83 and SEQ ID NOS 35, 59 and 84.

[0062] The invention also relates to a primer for the amplification by polymerization of an RNA or a DNA of a viral material, associated with MS and/or RA, comprising a nucleotide sequence identical or equivalent to at least one portion of the nucleotide sequence of any fragment such as is defined above, in particular a nucleotide sequence displaying, for any succession of at least 10 contiguous monomers, preferably 15 contiguous monomers, more preferably 18 contiguous monomers and most preferably 20 contiguous monomers, at least 70% homology with at least the said portion of the said fragment. Preferably, the nucleotide sequence of such a primer is identical to any one of the sequences selected from the group including SEQ ID NO: 15 to SEQ ID NO: 18, SEQ ID NO: 43 to SEQ ID NO: 46, SEQ ID NO: 51, SEQ ID NO: 60, SEQ ID NO: 72, SEQ ID NO: 76, SEQ ID NO: 80, SEQ ID NO: 93 to SEQ ID NO: 99, SEQ ID NO: 142, SEQ ID NO: 143, SEQ ID NO: 144, and SEQ ID NO: 145.

[0063] Generally speaking, the invention also encompasses any RNA or DNA, and in particular replication vector, comprising a genomic fragment of the viral material such as is defined above, or a nucleotide fragment such as is defined above.

[0064] The invention also relates to the different peptides encoded by any open reading frame belonging to a nucleotide fragment such as is defined above, in particular any polypeptide, for example any oligopeptide forming or comprising an antigenic determinant recognized by sera of patients infected with the MSRV-1 virus and/or in whom the MSRV-1 virus has been reactivated. Preferably, this polypeptide is antigenic, and is encoded by the open reading frame beginning, in the 5′-3′ direction, at nucleotide 181 and ending at nucleotide 330 of SEQ ID NO: 1.
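As a worked illustration of the coordinates given above, an open reading frame running from nucleotide 181 to nucleotide 330 spans 330 − 181 + 1 = 150 nucleotides, i.e. 50 codons. The sketch below assumes that the positions are 1-based and inclusive, which the description does not state explicitly; the function names are hypothetical and the placeholder string merely stands in for SEQ ID NO: 1, whose actual content is not reproduced here.

```python
from __future__ import annotations


def orf_slice(sequence: str, start: int, end: int) -> str:
    """Return nucleotides start..end, assuming 1-based inclusive positions."""
    if not (1 <= start <= end <= len(sequence)):
        raise ValueError("coordinates fall outside the sequence")
    return sequence[start - 1:end]


def codons(orf: str) -> list[str]:
    """Split an open reading frame into consecutive triplets."""
    if len(orf) % 3 != 0:
        raise ValueError("length is not a multiple of three")
    return [orf[i:i + 3] for i in range(0, len(orf), 3)]


if __name__ == "__main__":
    # Placeholder of 330 nucleotides; NOT the real SEQ ID NO: 1.
    placeholder = "ACG" * 110
    orf = orf_slice(placeholder, 181, 330)
    print(len(orf), len(codons(orf)))  # 150 50
```

The same helper can be applied to any of the nucleotide ranges quoted in this description, provided the same coordinate convention is assumed.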

[0065] The invention also encompasses the following polypeptides:

[0066] a)

[0067] a polypeptide encoded by any open reading frame belonging to a nucleotide fragment, fragment I, as defined above;

[0068] a polypeptide, characterized in that the open reading frame encoding it is comprised, in the 5′-3′ direction, between nucleotide 18 and nucleotide 2304 of SEQ ID NO: 87;

[0069] a polypeptide, having a peptide sequence comprising a sequence partially or totally identical to SEQ ID NO: 89;

[0070] b)

[0071] a polypeptide, recombinant or synthetic, having a peptide sequence which comprises a sequence identical or equivalent to SEQ ID NO: 90; in particular said polypeptide exhibits an enzymatic activity consisting of proteolytic activity;

[0072] a polypeptide, recombinant or synthetic, characterized in that the open reading frame encoding it begins, in the 5′-3′ direction, at nucleotide 18 and ends at nucleotide 340 of SEQ ID NO: 87;

[0073] a polypeptide having an inhibitory activity on the proteolytic activity of a polypeptide as defined according to b);

[0074] c)

[0075] a polypeptide, recombinant or synthetic, having a peptide sequence which comprises a sequence identical or equivalent to SEQ ID NO: 91; in particular said polypeptide exhibits a reverse transcriptase activity;

[0076] a polypeptide having a peptide sequence which comprises a sequence identical or equivalent to SEQ ID NO: 92; in particular said polypeptide exhibits a ribonuclease activity;

[0077] a polypeptide, recombinant or synthetic, characterized in that the open reading frame encoding it begins, in the 5′-3′ direction, at nucleotide 341 and ends at nucleotide 2304 of SEQ ID NO: 87;

[0078] a polypeptide, recombinant or synthetic, characterized in that the open reading frame encoding it begins, in the 5′-3′ direction, at nucleotide 1858 and ends at nucleotide 2304 of SEQ ID NO: 87;

[0079] a polypeptide having an inhibitory activity on the reverse transcriptase activity of a polypeptide as defined according to c) or on the ribonuclease H activity of a polypeptide as defined according to c).

[0080] In particular, the invention relates to an antigenic polypeptide recognized by the sera of patients infected with the MSRV-1 virus, and/or in whom the MSRV-1 virus has been reactivated, whose peptide sequence is partially or totally identical or is equivalent to the sequence defined by SEQ ID NO: 35, SEQ ID NO: 59, SEQ ID NO: 81, SEQ ID NO: 89, SEQ ID NO: 90, SEQ ID NO: 91, SEQ ID NO: 92, SEQ ID NO: 132, SEQ ID NO: 133, SEQ ID NO: 134, SEQ ID NO: 139, SEQ ID NO: 140 and SEQ ID NO: 141; such a sequence is identical, for example, to any sequence selected from the group including the sequences SEQ ID NO: 37 to SEQ ID NO: 40, SEQ ID NO: 59 and SEQ ID NO: 81.

[0081] The present invention also encompasses mono- or polyclonal antibodies directed against the MSRV-1 virus, which are obtained by the immunological reaction of a human or animal body or cells to an immunogenic agent consisting of an antigenic polypeptide such as is defined above.

[0082] The invention next relates to:

[0083] reagents for detection of the MSRV-virus, or of an exposure to the latter, comprising at least one reactive substance selected from the group consisting of a probe of the present invention, a polypeptide, in particular an antigenic peptide, such as is defined above, or an anti-ligand, in particular an antibody to the said polypeptide;

[0084] all diagnostic, prophylactic or therapeutic compositions comprising one or more peptides, in particular antigenic peptides, such as are defined above, or one or more anti-ligands, in particular antibodies to the peptides discussed above; such a composition is preferably, and by way of example, a vaccine composition.

[0085] The invention also relates to any diagnostic, prophylactic or therapeutic composition, in particular for inhibiting the expression of at least one virus associated with MS or RA, and/or the enzymatic activities of the proteins of said virus, comprising a nucleotide fragment such as is defined above or a polynucleotide, in particular oligonucleotide, whose sequence is partially identical to that of the said fragment, except for that of the fragment having the nucleotide sequence SEQ ID NO: 1. Likewise, it relates to any diagnostic, prophylactic or therapeutic composition, in particular for inhibiting the expression of at least one pathogenic and/or infective agent associated with RA, comprising a nucleotide fragment such as is defined above by reference to the pol and gag genes, and especially with respect to the sequences SEQ ID NOS 36, 47, 52, 55, 56, 57, 58 and 83.

[0086] According to the invention, these same fragments or polynucleotides, in particular oligonucleotides, may participate in all suitable compositions for detecting, according to any suitable process or method, a pathological and/or infective agent associated with MS and with RA, respectively, in a biological sample. In such a process, an RNA and/or a DNA presumed to belong or originating from the said pathological and/or infective agent, and/or their complementary RNA and/or DNA, is/are brought into contact with such a composition.

[0087] The present invention also relates to any process for detecting the presence or exposure to such a pathological and/or infective agent, in a biological sample, by bringing this sample into contact with a peptide, in particular an antigenic peptide such as is defined above, or an anti-ligand, in particular an antibody to this peptide, such as is defined above.

[0088] In practice, and for example, a device for detection of the MSRV-1 virus comprises a reagent such as is defined above, supported by a solid support which is immunologically compatible with the reagent, and a means for bringing the biological sample, for example a sample of blood or of cerebrospinal fluid, likely to contain anti-MSRV-1 antibodies, into contact with this reagent under conditions permitting a possible immunological reaction, the foregoing items being accompanied by means for detecting the immune complex formed with this reagent.

[0089] Lastly, the invention also relates to the detection of anti-MSRV-1 antibodies in a biological sample, for example a sample of blood or of cerebrospinal fluid, according to which this sample is brought into contact with a reagent such as is defined above, consisting of an antibody, under conditions permitting their possible immunological reaction, and the presence of the immune complex thereby formed with the reagent is then detected.

DEFINITIONS

[0090] Before describing the invention in detail, different terms used in the description and the claims are now defined:

[0091] strain or isolate is understood to mean any infective and/or pathogenic biological fraction containing, for example, viruses and/or bacteria and/or parasites, generating pathogenic and/or antigenic power, harbored by a culture or a living host; as an example, a viral strain according to the above definition can contain a coinfective agent, for example a pathogenic protist,

[0092] the term “MSRV” used in the present description denotes any pathogenic and/or infective agent associated with MS, in particular a viral species, the attenuated strains of the said viral species or the defective-interfering particles or particles containing coencapsidated genomes, or alternatively genomes recombined with a portion of the MSRV-1 genome, derived from this species. Viruses, and especially viruses containing RNA, are known to have a variability resulting, in particular, from relatively high rates of spontaneous mutation (7), which will be borne in mind below for defining the notion of equivalence,

[0093] human virus is understood to mean a virus capable of infecting, or of being harbored by, human beings,

[0094] in view of all the natural or induced variations and/or recombination which may be encountered when implementing the present invention, the subjects of the latter, defined above and in the claims, have been expressed including the equivalents or derivatives of the different biological materials defined below, in particular of the homologous nucleotide or peptide sequences,

[0095] the variant of a virus or of a pathogenic and/or infective agent according to the invention comprises at least one antigen recognized by at least one antibody directed against at least one corresponding antigen of the said virus and/or said pathogenic and/or infective agent, and/or a genome any part of which is detected by at least one hybridization probe and/or at least one nucleotide amplification primer specific for the said virus and/or pathogenic and/or infective agent, such as, for example, for the MSRV-1 virus, the primers and probes having a nucleotide sequence chosen from SEQ ID NO: 15 to SEQ ID NO: 21, SEQ ID NO: 23, SEQ ID NO: 27 to SEQ ID NO: 29, SEQ ID NO: 41, SEQ ID NO: 43, SEQ ID NO: 44, SEQ ID NO: 45, SEQ ID NO: 46, and their complementary sequences, under particular hybridization conditions well known to a person skilled in the art,

[0096] according to the invention, a nucleotide fragment or an oligonucleotide or polynucleotide is an arrangement of monomers, or a biopolymer, characterized by the informational sequence of the natural nucleic acids, which is capable of hybridizing with any other nucleotide fragment under predetermined conditions, it being possible for the arrangement to contain monomers of different chemical structures and to be obtained from a molecule of natural nucleic acid and/or by genetic recombination and/or by chemical synthesis; a nucleotide fragment may be identical to a genomic fragment of the MSRV-1 virus discussed in the present invention, in particular a gene of this virus, for example pol or env in the case of the said virus,

[0097] thus, a monomer can be a natural nucleotide of nucleic acid whose constituent elements are a sugar, a phosphate group and a nitrogenous base; in RNA the sugar is ribose, in DNA the sugar is 2-deoxyribose; depending on whether the nucleic acid is DNA or RNA, the nitrogenous base is chosen from adenine, guanine, uracil, cytosine and thymine; or the nucleotide can be modified in at least one of the three constituent elements; as an example, the modification can occur in the bases, generating modified bases such as inosine, 5-methyldeoxycytidine, deoxyuridine, 5-(dimethylamino)deoxyuridine, 2,6-diaminopurine, 5-bromodeoxyuridine and any other modified base promoting hybridization; in the sugar, the modification can consist of the replacement of at least one deoxyribose by a polyamide (8), and in the phosphate group, the modification can consist of its replacement by esters chosen, in particular, from diphosphate, alkyl- and arylphosphonate and phosphorothioate esters,

[0098] “informational sequence” is understood to mean any ordered succession of monomers whose chemical nature and order in a reference direction constitute an item of functional information of the same quality as that of the natural nucleic acids,

[0099] hybridization is understood to mean the process during which, under suitable working conditions, two nucleotide fragments having sufficiently complementary sequences pair to form a complex structure, in particular double or triple, preferably in the form of a helix,

[0100] a probe comprises a nucleotide fragment synthesized chemically or obtained by digestion or enzymatic cleavage of a longer nucleotide fragment, comprising at least six monomers, advantageously from 10 to 1000 monomers, preferably 10 to 30 monomers and more preferably 18 to 30, and possessing a specificity of hybridization under particular conditions; preferably, a probe possessing fewer than 10 monomers, or more preferably fewer than 15 monomers, is not used alone, but is used in the presence of other probes of equally short size or otherwise; under certain special conditions, it may be useful to use probes of size greater than 100 monomers; a probe may be used, in particular, for diagnostic purposes, such molecules being, for example, capture and/or detection probes,

[0101] the capture probe may be immobilized on a solid support by any suitable means, that is to say directly or indirectly, for example by covalent bonding or passive adsorption,

[0102] the detection probe may be labelled by means of a label chosen, in particular, from radioactive isotopes, enzymes chosen, in particular, from peroxidase and alkaline phosphatase and those capable of hydrolyzing a chromogenic, fluorogenic or luminescent substrate, chromophoric chemical compounds, chromogenic, fluorogenic or luminescent compounds, nucleotide base analogues and biotin,

[0103] the probes used for diagnostic purposes of the invention may be employed in all known hybridization techniques, and in particular the techniques termed “DOT-BLOT” (9), “SOUTHERN BLOT” (10), “NORTHERN BLOT”, which is a technique identical to the “SOUTHERN BLOT” technique but which uses RNA as target, and the SANDWICH technique (11); advantageously, the SANDWICH technique is used in the present invention, comprising a specific capture probe and/or a specific detection probe, on the understanding that the capture probe and the detection probe must possess an at least partially different nucleotide sequence,

[0104] any probe according to the present invention can hybridize in vivo or in vitro with RNA and/or with DNA in order to block the phenomena of replication, in particular translation and/or transcription, and/or to degrade the said DNA and/or RNA,

[0105] a primer is a probe comprising at least six monomers, and advantageously from 10 to 30 monomers, and preferably from 18 to 25 monomers, possessing a specificity of hybridization under particular conditions for the initiation of an enzymatic polymerization, for example in an amplification technique such as PCR (polymerase chain reaction), in an elongation process such as sequencing, in a method of reverse transcription or the like,

[0106] two nucleotide or peptide sequences are termed equivalent or derived with respect to one another, or with respect to a reference sequence, if functionally the corresponding biopolymers can perform substantially the same role, without being identical, as regards the application or use in question, or in the technique in which they participate; two sequences are, in particular, equivalent if they are obtained as a result of natural variability, in particular spontaneous mutation of the species from which they have been identified, or induced variability, as are two homologous sequences, homology being defined below,

[0107] “variability” is understood to mean any spontaneous or induced modification of a sequence, in particular by substitution and/or insertion and/or deletion of nucleotides and/or of nucleotide fragments, and/or extension and/or shortening of the sequence at one or both ends; an unnatural variability can result from the genetic engineering techniques used, for example the choice of synthesis primers, degenerate or otherwise, selected for amplifying a nucleic acid; this variability can manifest itself in modifications of any starting sequence, considered as reference, and capable of being expressed by a degree of homology relative to the said reference sequence,

[0108] homology characterizes the degree of identity of two nucleotide or peptide fragments compared; it is measured by the percentage identity which is determined, in particular, by direct comparison of nucleotide or peptide sequences, relative to reference nucleotide or peptide sequences,

[0109] this percentage identity has been specifically determined for the nucleotide fragments, clones in particular, dealt with in the present invention, which are homologous to the fragments identified, for the MSRV-1 virus, by SEQ ID NO: 1 to SEQ ID NO: 9, SEQ ID NO: 42, SEQ ID NO: 47 to SEQ ID NO: 49, SEQ ID NO: 36, SEQ ID NO: 52, SEQ ID NO: 53 and SEQ ID NO: 87, as well as for the probes and primers homologous to the probes and primers identified by SEQ ID NO: 17 to SEQ ID NO: 21, SEQ ID NO: 23, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 27 to SEQ ID NO: 29, SEQ ID NO: 41, SEQ ID NO: 43, SEQ ID NO: 44, SEQ ID NO: 45, SEQ ID NO: 46, SEQ ID NO: 51, SEQ ID NO: 36, SEQ ID NO: 52, SEQ ID NO: 53, SEQ ID NO: 72, SEQ ID NO: 76, and SEQ ID NO: 93 to SEQ ID NO: 99; as an example, the smallest percentage identity observed between the different general consensus sequences of nucleic acids obtained from fragments of MSRV-1 viral RNA, originating from the LM7PC and PLI-2 lines according to a protocol detailed later, is 67% in the region described in FIG. 1,

[0110] any nucleotide fragment is termed equivalent or derived from a reference fragment if it possesses a nucleotide sequence equivalent to the sequence of the reference fragment; according to the above definition, the following in particular are equivalent to a reference nucleotide fragment:

[0111] a) any fragment capable of hybridizing at least partially with the complement of the reference fragment,

[0112] b) any fragment whose alignment with the reference fragment results in the demonstration of a larger number of identical contiguous bases than with any other fragment originating from another taxonomic group,

[0113] c) any fragment resulting, or capable of resulting, from the natural variability of the species from which it is obtained,

[0114] d) any fragment capable of resulting from the genetic engineering techniques applied to the reference fragment,

[0115] e) any fragment containing at least eight contiguous nucleotides encoding a peptide which is homologous or identical to the peptide encoded by the reference fragment,

[0116] f) any fragment which is different from the reference fragment by insertion, deletion or substitution of at least one monomer, or extension or shortening at one or both of its ends; for example, any fragment corresponding to the reference fragment flanked at one or both of its ends by a nucleotide sequence not coding for a polypeptide,

[0117] polypeptide is understood to mean, in particular, any peptide of at least two amino acids, in particular an oligopeptide, or protein, and for example an enzyme, extracted, separated or substantially isolated or synthesized through human intervention, in particular those obtained by chemical synthesis or by expression in a recombinant organism,

[0118] polypeptide partially encoded by a nucleotide fragment is understood to mean a polypeptide possessing at least three amino acids encoded by at least nine contiguous monomers lying within the said nucleotide fragment,

[0119] an amino acid is termed analogous to another amino acid when their respective physicochemical properties, such as polarity, hydrophobicity and/or basicity and/or acidity and/or neutrality are substantially the same; thus, a leucine is analogous to an isoleucine,

[0120] any polypeptide is termed equivalent or derived from a reference polypeptide if the polypeptides compared have substantially the same properties, and in particular the same antigenic, immunological, enzymological and/or molecular recognition properties; the following in particular are equivalent to a reference polypeptide:

[0121] a) any polypeptide possessing a sequence in which at least one amino acid has been replaced by an analogous amino acid,

[0122] b) any polypeptide having an equivalent peptide sequence, obtained by natural or induced variation of the said reference polypeptide and/or of the nucleotide fragment coding for the said polypeptide,

[0123] c) a mimotope of the said reference polypeptide,

[0124] d) any polypeptide in whose sequence one or more amino acids of the L series are replaced by an amino acid of the D series, and vice versa,

[0125] e) any polypeptide into whose sequence a modification of the side chains of the amino acids has been introduced, such as, for example, an acetylation of the amine functions, a carboxylation of the thiol functions, an esterification of the carboxyl functions,

[0126] f) any polypeptide in whose sequence one or more peptide bonds have been modified, such as, for example, carba, retro, inverso, retro-inverso, reduced and methylenoxy bonds,

[0127] g) any polypeptide at least one antigen of which is recognized by an antibody directed against a reference polypeptide,

[0128] the percentage identity characterizing the homology of two peptide fragments compared is, according to the present invention, at least 50% and preferably at least 70%.
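As an illustration of the percentage identity used above to characterize homology, the following Python sketch compares two sequences position by position. It is a minimal sketch under stated assumptions: the two sequences are taken to be already aligned and of equal length, gaps are not handled, and the function names and example peptides are hypothetical and do not correspond to any sequence of the invention.

```python
def percent_identity(seq_a, seq_b):
    """Percentage of identical positions between two pre-aligned sequences.

    Assumption: both sequences have the same non-zero length and are
    already aligned; gap handling is deliberately left out of this sketch.
    """
    if not seq_a or len(seq_a) != len(seq_b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(1 for x, y in zip(seq_a, seq_b) if x == y)
    return 100.0 * matches / len(seq_a)


def homology_class(identity):
    """Classify a peptide identity against the 50% and 70% thresholds."""
    if identity >= 70.0:
        return "preferred"       # at least 70% identity
    if identity >= 50.0:
        return "homologous"      # at least 50% identity
    return "below threshold"


# Hypothetical five-residue peptides differing at one position.
print(percent_identity("MKVLA", "MKILA"))                  # 80.0
print(homology_class(percent_identity("MKVLA", "MKILA")))  # preferred
```

In practice an alignment step would precede this comparison; the direct comparison shown here only covers the simplest case.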

[0129] In view of the fact that a virus possessing reverse transcriptase enzymatic activity may be genetically characterized equally well in RNA and in DNA form, both the viral DNA and RNA will be referred to for characterizing the sequences relating to a virus possessing such reverse transcriptase activity, termed MSRV-1 according to the present description.

[0130] The expressions of order used in the present description and the claims, such as “first nucleotide sequence”, are not adopted so as to express a particular order, but so as to define the invention more clearly.

[0131] Detection of a substance or agent is understood below to mean both an identification and a quantification, or a separation or isolation, of the said substance or said agent.

BRIEF DESCRIPTION OF THE DRAWINGS

[0132] A better understanding of the invention will be gained on reading the detailed description which follows, prepared with reference to the attached figures, in which:

[0133] FIG. 1 shows general consensus sequences of nucleic acids of the MSRV-1B clones amplified by the PCR technique in the “pol” region defined by Shih (12), from viral DNA originating from the LM7PC and PLI-2 lines, and identified under the references SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5 and SEQ ID NO: 6, and the common consensus with amplification primers bearing the reference SEQ ID NO: 7;

[0134] FIG. 2 gives the definition of a functional reading frame for each MSRV-1B/“PCR pol” type family, the said families A to D being defined, respectively, by the nucleotide sequences SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5 and SEQ ID NO: 6 described in FIG. 1;

[0135] FIG. 3 gives an example of consensus of the MSRV-2B sequences, identified by SEQ ID NO: 11;

[0136] FIG. 4 is a representation of the reverse transcriptase (RT) activity in dpm (disintegrations per minute) in the sucrose fractions taken from a purification gradient of the virions produced by the B lymphocytes in culture from a patient suffering from MS;

[0137] FIG. 5 gives, under the same experimental conditions as in FIG. 4, the assay of the reverse transcriptase activity in the culture of a B lymphocyte line obtained from a control free from MS;

[0138] FIG. 6 shows the nucleotide sequence of the clone PSJ17 (SEQ ID NO: 9);

[0139] FIG. 7 shows the nucleotide sequence SEQ ID NO: 8 of the clone designated M003-P004;

[0140] FIG. 8 shows the nucleotide sequence SEQ ID NO: 2 of the clone F11-1; the portion located between the two arrows in the region of the primer corresponds to a variability imposed by the choice of primer which was used for the cloning of F11-1; in this same Figure, the translation into amino acids is shown;

[0141] FIG. 9 shows the nucleotide sequence SEQ ID NO: 1, and a possible functional reading frame of SEQ ID NO: 1 in terms of amino acids; on this sequence, the consensus sequences of the pol gene are underlined;

[0142] FIGS. 10 and 11 give the results of a PCR, in the form of a photograph under ultraviolet light of an ethidium bromide-impregnated agarose gel, of the amplification products obtained from the primers identified by SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17 and SEQ ID NO: 18;

[0143] FIG. 12 gives a representation in matrix form of the homology between SEQ ID NO: 1 of MSRV-1 and that of an endogenous retrovirus designated HSERV9; this homology of at least 65% is demonstrated by a continuous line, the absence of a line meaning a homology of less than 65%;

[0144] FIG. 13 shows the nucleotide sequence SEQ ID NO: 42 of the clone FBd3;

[0145] FIG. 14 shows the sequence homology between the clone FBd3 and the HSERV-9 retrovirus;

[0146] FIG. 15 shows the nucleotide sequence SEQ ID NO: 47 of the clone t pol;

[0147] FIGS. 16 and 17 show, respectively, the nucleotide sequences SEQ ID NO: 48 and SEQ ID NO: 49 of the clones JLBc1 and JLBc2;

[0148] FIG. 18 shows the sequence homology between the clone JLBc1 and the clone FBd3;

[0149] FIG. 19 shows the sequence homology between the clone JLBc2 and the clone FBd3;

[0150] FIG. 20 shows the sequence homology between the clones JLBc1 and JLBc2;

[0151] FIGS. 21 and 22 show the sequence homology between the HSERV-9 retrovirus and the clones JLBc1 and JLBc2, respectively;

[0152] FIG. 23 shows the nucleotide sequence SEQ ID NO: 52 of the clone GM3;

[0153] FIG. 24 shows the sequence homology between the HSERV-9 retrovirus and the clone GM3;

[0154] FIG. 25 shows the localization of the different clones studied, relative to the genome of the known retrovirus ERV9;

[0155] FIG. 26 shows the position of the clones F11-1, M003-P004, MSRV-1B and PSJ17 in the region hereinafter designated MSRV-1 pol*;

[0156] FIG. 27, split into three successive FIGS. 27a-27c, shows a possible reading frame covering the whole of the pol gene;

[0157] FIG. 28 shows, according to SEQ ID NO: 36, the nucleotide sequence coding for the peptide fragment POL2B, having the amino acid sequence identified by SEQ ID NO: 35;

[0158] FIG. 29 shows the OD values (ELISA tests) at 492 nm obtained for 29 sera of MS patients and 32 sera of healthy controls tested with an anti-IgG antibody;

[0159] FIG. 30 shows the OD values (ELISA tests) at 492 nm obtained for 36 sera of MS patients and 42 sera of healthy controls tested with an anti-IgM antibody;

[0160] FIGS. 31 to 33 show the results obtained (relative intensity of the spots) for 43 overlapping octapeptides covering the amino acid sequence 61-110, according to the Spotscan technique, respectively with a pool of MS sera, with a pool of control sera and with the pool of MS sera after deduction of a background corresponding to the maximum signal detected on at least one octapeptide with the control serum (intensity=1), on the understanding that these sera were diluted to 1/50. The bar at the far right-hand end represents a graphic scale standard unrelated to the serological test;

[0161] FIG. 34 shows the SEQ ID NO: 37 and SEQ ID NO: 38 of two polypeptides comprising immunodominant regions, while SEQ ID NO: 39 and 40 represent immunoreactive polypeptides specific to MS;

[0162] FIG. 35 shows the nucleotide sequence SEQ ID NO: 55 of the clone LB19 and three potential reading frames of SEQ ID NO: 55 in terms of amino acids;

[0163] FIG. 36 shows the nucleotide sequence SEQ ID NO: 82 (GAG*) and a potential reading frame of SEQ ID NO: 82 in terms of amino acids;

[0164] FIG. 37 shows the sequence homology between the clone FBd13 and the HSERV-9 retrovirus; according to this representation, the continuous line means a percentage homology greater than or equal to 70% and the absence of a line means a smaller percentage homology;

[0165] FIG. 38 shows the nucleotide sequence SEQ ID NO: 57 of the clone FP6 and three potential reading frames of SEQ ID NO: 57 in terms of amino acids;

[0166] FIG. 39 shows the nucleotide sequence SEQ ID NO: 83 of the clone G+E+A and three potential reading frames of SEQ ID NO: 83 in terms of amino acids;

[0167] FIG. 40 shows a reading frame found in the region E and coding for an MSRV-1 retroviral protease identified by SEQ ID NO: 84;

[0168] FIG. 41 shows the response of each serum of patients suffering from MS, indicated by the symbol (+), and of healthy patients, symbolised by (−), tested with an anti-IgG antibody, expressed as net optical density at 492 nm;

[0169] FIG. 42 shows the response of each serum of patients suffering from MS, indicated by the symbols (+) and (QS), and of healthy patients (−), tested with an anti-IgM antibody, expressed as net optical density at 492 nm;

[0170] FIG. 43 shows the RT-activity profile in sucrose density gradients of pellets from B-cell line supernatants; Control B-cell line n was obtained from the relative of a patient with mitochondriopathy. MS B-cell line o was obtained from a patient with definite MS;

[0171] FIG. 44 shows the nucleotide and amino acid alignment of the conserved pol regions of viruses detected in the study (cf Example 18) by the “Pan-retrovirus” PCR. “Deletions” are represented by dashes and standard single-letter abbreviations are used to designate amino acids and nucleotides (i=inosine). The most highly conserved VLPQG and YXDD regions are shown as separate blocks in bold type at the end of each sequence. Amino acids which are present in all or in all but one of the sequences are underlined. PCR primers (modified from (12)) PAN-UO and PAN-UI are orientated 5′ to 3′ (sense) whereas primer PAN-DI is 3′ to 5′ (antisense). Degeneracies are shown above (PAN-UO & PAN-DI) or below (PAN-UI) the PCR primer sequences. “I” denotes the nine base 5′ extension cttggatcc, “-I” denotes the nine base 5′ extension ctcaagctt. The capture and detector probes DpV1 and CpV1b used in the ELOSA assay are shown below a representative MSRV-cpol sequence. At three positions below the translated MSRV-cpol sequence alternative amino acids (representing “non-silent” nucleic acid variations) are shown in italics; K and Y substitutions were only observed in PLI-1 derived clones whereas R and W were encoded by a significant proportion of the clones irrespective of derivation. Note that DpV1 is peroxidase labelled and that CpV1b may be biotinylated at the 5′ end if streptavidin coated plates are used. The name of each sequence is indicated at the left of the figure.

[0172] HTLV1: Human T-cell Leukaemia Virus type 1; HIV1: Human Immunodeficiency Virus type 1; MoMLV: Moloney-Murine Leukaemia Virus; MPMV: Mason-Pfizer Monkey Virus. ERV9: Endogenous Retrovirus 9. MSRV-cpol: Multiple Sclerosis associated RetroVirus conserved pol region.

[0173] FIG. 45 shows a phylogenetic tree which is based on the conserved amino acid region encoded by the pol gene of MSRV and of representative endogenous and exogenous retroviruses and DNA viruses with reverse transcriptase. It was generated by the U.P.G.M.A. tree program of Geneworks® software.

[0174] HSRV: Human Spumaretrovirus. EIAV: Equine Infectious Anaemia Virus. BLV: Bovine Leukaemia Virus. HIV1, HIV2: Human Immunodeficiency Viruses type 1 and 2. HTLV1 and HTLV2: Human T-cell Leukaemia Viruses type 1 and 2. F-MuLV: Friend-Murine Leukaemia Virus. MoMLV: Moloney-Murine Leukaemia Virus. BAEV: Baboon Endogenous Virus. GaLV: Gibbon Ape Leukaemia Virus. HUMER41: Human Endogenous Retroviral sequence, clone 41. IAP: Intracisternal A-type Particle. MPMV: Mason-Pfizer Monkey Virus. HERVK10: Human Endogenous Retrovirus K10. MMTV: Mouse Mammary Tumour Virus. HSERV9 (ERV9 database sequence): Human sequence of Endogenous Retrovirus 9. MSRV: Multiple Sclerosis associated RetroVirus. SIV: Simian Immunodeficiency Virus; RTLV-H: Reverse Transcriptase-Like Viral sequence H; SFV: Simian Foamy Virus; VISNA: Visna retrovirus; SIV1: Simian Immunodeficiency Virus type 1; SRV-2: Simian Retrovirus type 2; SMRV-H: Squirrel Monkey Retrovirus H.

[0175] FIG. 46 shows the MSRV sequence in the Protease and Reverse-Transcriptase regions of the pol gene. The amino acid translation is aligned under the corresponding nucleotide sequence. The region corresponding to the Protease ORF cloned in a recombinant vector and expressed in E. coli is boxed. The regions corresponding to the A and B fragments amplified on plasma samples from MS patients are indicated by brackets. The Reverse-Transcriptase (RT) and RNase H (RNH) region is boxed with a dotted line. The highly conserved amino acids and/or active sites of enzyme activities of both PRT and RT (including RNH) are shown underlined.

[0176] FIG. 47A illustrates the specific detection of MSRV-pol RNA sequence by RT-PCR in the sucrose density fraction associated with RT-activity and in MS plasma; FIG. 47B shows the RT-activity profile on a sucrose density gradient obtained with extracellular virion pelleted from an MS choroid-plexus culture. The photograph below shows an agarose gel loaded with PCR products amplified from round 1 (ST1.1) RT-PCR products with the ST1.2 primer set. From left to right: water control 1 from RT-PCR step with ST1.1 set; water control 2 amplified from water control 1 with ST1.2 nested primers; molecular weight markers; fractions n°1 to 10 corresponding to the RT-activity profile shown above; plasma samples C1 and C2 from healthy blood donors; plasma samples MS1 and MS2 from two MS patients.

[0177] FIG. 48 shows an example of a variant and/or recombined sequence in the region of the pol gene defined by homology with the overlapping regions described in FIG. 25, as GM3, MSRV-1 pol*, t pol and FBd3.

[0178] FIG. 49 shows the nucleotide (FIG. 49A) and amino acid (FIG. 49B) alignments of the pol region between clones 1, 5 and 8 of the same patient (Experiment 46-7).

[0179] FIG. 50 shows the nucleotide (FIG. 50A) and amino acid (FIG. 50B) alignments of the pol region between clones 41, 43 and 42 of the same patient (Experiment 68-1).

[0180] FIG. 51 shows the nucleotide (FIG. 51A) and amino acid (FIG. 51B) alignments of the pol region between the consensus sequence (SEQ ID NO: 135) of clones 1, 5 and 8 of the same patient (Experiment 46-7) and SEQ ID NO: 1, and between their corresponding peptide sequences.

[0181] FIG. 52 shows the nucleotide (FIG. 52A) and amino acid (FIG. 52B) alignments of the pol region between the consensus sequence (SEQ ID NO: 128) of clones 41, 43 and 42 of the same patient (Experiment 68-1) and SEQ ID NO: 1, and between their corresponding peptide sequences.

[0182] FIG. 53 shows the nucleotide (FIG. 53A) and amino acid (FIG. 53B) alignments of the pol region between the consensus sequence (SEQ ID NO: 135) of clones 1, 5 and 8 of the same patient (Experiment 46-7) and the consensus sequence (SEQ ID NO: 128) of clones 41, 43 and 42 of the same patient (Experiment 68-1).

[0183] Table 5 (at the end of the description) shows the sequences obtained by RT-PCR with degenerate pol primers on sucrose density gradient fractions containing the peak of RT-activity or its negative control (cf Example 18); and

[0184] Table 6 (at the end of the description) shows the clinical data and results of MSRV-cpol detection by “Pan-retro” PCR with specific ELOSA assay, on CSF from MS and control patients (cf Example 18).
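Several of the homology figures listed above (FIG. 12 with a 65% threshold, FIG. 37 with a 70% threshold) draw a continuous line wherever the local homology reaches the threshold and leave a blank elsewhere. The following Python sketch illustrates that thresholding idea only. It is a simplification under stated assumptions: the two sequences are treated as already aligned along a single diagonal rather than compared in full matrix form, the window size is an arbitrary choice, and the sequences and function names are hypothetical.

```python
def window_identity(seq_a, seq_b, window):
    """Percentage identity in each sliding window of two pre-aligned sequences.

    Assumption: equal-length, ungapped sequences; one value per window start.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be of equal length")
    scores = []
    for start in range(len(seq_a) - window + 1):
        a = seq_a[start:start + window]
        b = seq_b[start:start + window]
        matches = sum(1 for x, y in zip(a, b) if x == y)
        scores.append(100.0 * matches / window)
    return scores


def line_trace(seq_a, seq_b, window, threshold=65.0):
    """Draw '-' where the windowed identity reaches the threshold, else ' '."""
    return "".join("-" if score >= threshold else " "
                   for score in window_identity(seq_a, seq_b, window))


# Hypothetical sequences: identical at the start, divergent at the end.
print(window_identity("ACGTACGT", "ACGTTTTT", 4))   # [100.0, 75.0, 50.0, 25.0, 25.0]
print(repr(line_trace("ACGTACGT", "ACGTTTTT", 4)))  # '--   '
```

Raising the threshold to 70% changes only the cut-off at which a window is drawn, not the windowed calculation itself.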

DETAILED DESCRIPTION OF PREFERRED EMBODIMENTS

EXAMPLE 1 Obtaining Clones Designated MSRV-1B and MSRV-2B, Defining, Respectively, a Retrovirus MSRV-1 and a Coinfective Agent MSRV-2, by “Nested” PCR Amplification of the Conserved POL Regions of Retroviruses on Virion Preparations Originating from the LM7PC and PLI-2 Lines

[0185] A PCR technique derived from the technique published by Shih (12) was used. This technique enables all trace of contaminant DNA to be removed by treating all the components of the reaction medium with DNase. It concomitantly makes it possible, by the use of different but overlapping primers in two successive series of PCR amplification cycles, to increase the chances of amplifying a cDNA synthesized from an amount of RNA which is small at the outset and further reduced in the sample by the spurious action of the DNase on the RNA. In effect, the DNase is used under conditions of activity in excess which enable all trace of contaminant DNA to be removed before inactivation of this enzyme remaining in the sample by heating to 85° C. for 10 minutes. This variant of the PCR technique described by Shih (12) was used on a cDNA synthesized from the nucleic acids of fractions of infective particles purified on a sucrose gradient according to the technique described by H. Perron (13) from the “POL-2” isolate (ECACC No. V92072202) produced by the PLI-2 line (ECACC No. 92072201) on the one hand, and from the MS7PG isolate (ECACC No. V93010816) produced by the LM7PC line (ECACC No. 93010817) on the other hand. These cultures were obtained according to the methods which formed the subject of the patent applications published under Nos WO 93/20188 and WO 93/20189.
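The principle of two successive series of amplification cycles can be illustrated in silico. The following Python sketch is illustrative only and makes several assumptions: the template and all four primers are invented for the example and are not sequences of the invention, primers are matched exactly (no degeneracy or mismatch tolerance), and the second-round primers are assumed to lie wholly within the first-round product, whereas the technique described above uses different but overlapping primers.

```python
def revcomp(seq):
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    pairs = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(pairs[base] for base in reversed(seq))


def amplify(template, forward, reverse):
    """Return the product bounded by two exactly matching primers, or None.

    `forward` is read 5'->3' on the given strand; `reverse` is read 5'->3'
    on the opposite strand, so its binding site is its reverse complement.
    """
    start = template.find(forward)
    if start < 0:
        return None
    site = revcomp(reverse)
    end = template.find(site, start + len(forward))
    if end < 0:
        return None
    return template[start:end + len(site)]


# Hypothetical template: outer sites flank inner sites, which flank the target.
TEMPLATE = ("TT" + "ACGATC" + "GA" + "CCAGTC" + "ATGCAAGG"
            + "TTGACG" + "AC" + "GGTACA" + "TT")
OUTER_FWD, OUTER_REV = "ACGATC", "TGTACC"   # first series of cycles
INNER_FWD, INNER_REV = "CCAGTC", "CGTCAA"   # second series of cycles

round1 = amplify(TEMPLATE, OUTER_FWD, OUTER_REV)
round2 = amplify(round1, INNER_FWD, INNER_REV)
print(round1)  # ACGATCGACCAGTCATGCAAGGTTGACGACGGTACA
print(round2)  # CCAGTCATGCAAGGTTGACG
```

The second round succeeds only on a product that contains both inner sites, which is why a second primer set adds specificity as well as sensitivity.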

[0186] After cloning the products amplified by this technique with the TA Cloning Kit™ and analysis of the sequence using an Applied Biosystems model 373A Automatic Sequencer, the sequences were analysed using the Geneworks® software on the latest available version of the GenBank™ data bank.

[0187] The sequences cloned and sequenced from these samples correspond, in particular, to two types of sequence: a first type of sequence, to be found in the majority of the clones (55% of the clones originating from the POL-2 isolates of the PLI-2 culture, and 67% of the clones originating from the MS7PG isolates of the LM7PC cultures), which corresponds to a family of “pol” sequences closely similar to, but different from, the endogenous human retrovirus designated ERV-9 or HSERV-9, and a second type of sequence which corresponds to sequences very strongly homologous to a sequence attributed to another infective and/or pathogenic agent designated MSRV-2.

[0188] The first type of sequence, representing the majority of the clones, consists of sequences whose variability enables four subfamilies of sequences to be defined. These subfamilies are sufficiently similar to one another for it to be possible to consider them to be quasi-species originating from the same retrovirus, as is well known for the HIV-1 retrovirus (14), or to be the outcome of interference with several endogenous proviruses coregulated in the producing cells. These more or less defective endogenous elements are sensitive to the same regulatory signals possibly generated by a replicative provirus, since they belong to the same family of endogenous retroviruses (15). This new family of endogenous retroviruses, or alternatively this new retroviral species from which the generation of quasi-species has been obtained in culture, and which contains a consensus of the sequences described below, is designated MSRV-1B.

[0189] FIG. 1 presents the general consensus sequences of the sequences of the different MSRV-1B clones sequenced in this experiment, these sequences being identified, respectively, by SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5 and SEQ ID NO: 6. These sequences display a homology with respect to nucleic acids ranging from 70% to 88% with the HSERV9 sequence referenced X57147 and M37638 in the GenBank® data base. Four “consensus” nucleic acid sequences representative of different quasi-species of a possibly exogenous retrovirus MSRV-1B, or of different subfamilies of an endogenous retrovirus MSRV-1B, have been defined. These representative consensus sequences are presented in FIG. 2, with the translation into amino acids. A functional reading frame exists for each subfamily of these MSRV-1B sequences, and it can be seen that the functional open reading frame corresponds in each instance to the amino acid sequence appearing on the second line under the nucleic acid sequence. The general consensus of the MSRV-1B sequence, identified by SEQ ID NO: 7 and obtained by this PCR technique in the “pol” region, is presented in FIG. 1.
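The search for a functional reading frame among the three forward frames of a nucleotide sequence can be sketched as follows. This is a minimal Python illustration under stated assumptions: it uses the standard genetic code, treats a frame as open simply when its translation contains no stop codon, ignores the reverse strand, and uses an invented eight-base sequence rather than any sequence of the invention. The function names are hypothetical.

```python
# Standard genetic code, with codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(seq, frame=0):
    """Translate one forward frame (0, 1 or 2); '*' marks a stop codon."""
    seq = seq.upper().replace("U", "T")   # accept RNA as well as DNA
    codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
    # Codons containing non-ACGT symbols are reported as 'X'.
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def open_frames(seq):
    """Forward frames whose translation contains no stop codon."""
    return [frame for frame in range(3) if "*" not in translate(seq, frame)]


# Hypothetical sequence: frame 0 meets a stop codon, frames 1 and 2 stay open.
example = "ATGTAAGG"
print([translate(example, f) for f in range(3)])  # ['M*', 'CK', 'VR']
print(open_frames(example))                       # [1, 2]
```

On a real clone, the frame retained would normally also be checked for the conserved pol motifs, which this sketch does not attempt.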

[0190] The second type of sequence representing the majority of theclones sequenced is represented by the sequence MSRV-2B presented inFIG. 3 and identified by SEQ ID NO: 11. The differences observed in thesequences corresponding to the PCR primers are explained by the use ofdegenerate primers in mixture form used under different technicalconditions.

[0191] The MSRV-2B sequence (SEQ ID NO: 11) is sufficiently divergent from the retroviral sequences already described in the data banks for it to be suggested that the sequence region in question belongs to a new infective agent, designated MSRV-2. This infective agent would, in principle, on the basis of the analysis of the first sequences obtained, be related to a retrovirus but, in view of the technique used for obtaining this sequence, it could also be a DNA virus whose genome codes for an enzyme which incidentally possesses reverse transcriptase activity, as is the case, for example, with the hepatitis B virus, HBV (12). Furthermore, the random nature of the degenerate primers used for this PCR amplification technique may very well have permitted, as a result of unforeseen sequence homologies or of conserved sites in the gene for a related enzyme, the amplification of a nucleic acid originating from a prokaryotic or eukaryotic pathogenic and/or coinfective agent (protist).

EXAMPLE 2 Obtaining Clones Designated MSRV-1B and MSRV-2B, Defining a Family MSRV-1 and MSRV-2, by “Nested” PCR Amplification of the Conserved POL Regions of Retroviruses on Preparations of B Lymphocytes from a New Case of MS

[0192] The same PCR technique, modified according to the technique of Shih (12), was used to amplify and sequence the RNA nucleic acid material present in a purified fraction of virions at the peak of “LM7-like” reverse transcriptase activity on a sucrose gradient according to the technique described by H. Perron (13), and according to the protocols mentioned in Example 1, from a spontaneous lymphoblastoid line obtained by self-immortalization in culture of B lymphocytes from an MS patient who was seropositive for the Epstein-Barr virus (EBV), after setting up the blood lymphoid cells in culture in a suitable culture medium containing a suitable concentration of cyclosporin A. A representation of the reverse transcriptase activity in the sucrose fractions taken from a purification gradient of the virions produced by this line is presented in FIG. 4. Similarly, the culture supernatants of a B line obtained under the same conditions from a control free from MS were treated under the same conditions, and the assay of reverse transcriptase activity in the sucrose gradient fractions proved negative throughout (background), and is presented in FIG. 5. Fraction 3 of the gradient corresponding to the MS B line and the same fraction without reverse transcriptase activity of the non-MS control gradient were analysed by the same RT-PCR technique as before, derived from Shih (12), followed by the same steps of cloning and sequencing as described in Example 1.

[0193] It is particularly noteworthy that the MSRV-1 and MSRV-2 type sequences are to be found only in the material associated with a peak of “LM7-like” reverse transcriptase activity originating from the MS B lymphoblastoid line. These sequences were not to be found with the material from the control (non-MS) B lymphoblastoid line in 26 recombinant clones taken at random. Only Mo-MuLV type contaminant sequences, originating from the commercial reverse transcriptase used for the cDNA synthesis step, and sequences without any particular retroviral analogy were to be found in this control, as a result of the “consensus” amplification of homologous polymerase sequences which is produced by this PCR technique. Furthermore, the absence of a concentrated target which competes for the amplification reaction in the control sample permits the amplification of dilute contaminants. The difference in results is manifestly highly significant (chi-squared, p<0.001).

EXAMPLE 3 Obtaining a Clone PSJ17, Defining a Retrovirus MSRV-1, by Reaction of Endogenous Reverse Transcriptase with a Virion Preparation Originating from the PLI-2 Line

[0194] This approach is directed towards obtaining reverse-transcribed DNA sequences from the supposedly retroviral RNA in the isolate using the reverse transcriptase activity present in this same isolate. This reverse transcriptase activity can theoretically function only in the presence of a retroviral RNA linked to a primer tRNA or hybridized with short strands of DNA already reverse-transcribed in the retroviral particles (16). Thus, the obtaining of specific retroviral sequences in a material contaminated with cellular nucleic acids was optimized according to these authors by means of the specific enzymatic amplification of the portions of viral RNAs with a viral reverse transcriptase activity. To this end, the authors determined the particular physicochemical conditions under which this enzymatic activity of reverse transcription on RNAs contained in virions could be effective in vitro. These conditions correspond to the technical description of the protocols presented below (endogenous RT reaction, purification, cloning and sequencing).

[0195] The molecular approach consisted in using a preparation of concentrated but unpurified virion obtained from the culture supernatants of the PLI-2 line, prepared according to the following method: the culture supernatants are collected twice weekly, precentrifuged at 10,000 rpm for 30 minutes to remove cell debris and then frozen at −80° C. or used as they are for the following steps. The fresh or thawed supernatants are centrifuged on a cushion of 30% glycerol-PBS at 100,000 g (or 30,000 rpm in a type 45 T LKB-HITACHI rotor) for 2 h at 4° C. After removal of the supernatant, the sedimented pellet is taken up in a small volume of PBS and constitutes the fraction of concentrated but unpurified virion. This concentrated but unpurified viral sample was used to perform a so-called endogenous reverse transcription reaction, as described below.
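For readers wishing to reproduce the centrifugation on a different rotor, the correspondence between rotor speed and relative centrifugal force may be sketched as follows. The sketch uses the customary relation RCF = 1.118 × 10⁻⁵ × r × N² (r in cm, N in rpm); the rotor radius is not stated in the text, so the radius computed here is only the value implied by the quoted pair of figures, and the function names are hypothetical.

```python
# Customary conversion constant for radius in cm and speed in rpm.
RCF_CONSTANT = 1.118e-5


def rcf(rpm, radius_cm):
    """Relative centrifugal force (in multiples of g) at a given radius."""
    return RCF_CONSTANT * radius_cm * rpm ** 2


def radius_for(rcf_target, rpm):
    """Radius (cm) at which a rotor spinning at rpm produces rcf_target."""
    return rcf_target / (RCF_CONSTANT * rpm ** 2)


if __name__ == "__main__":
    # Figures quoted in the text: 100,000 g at 30,000 rpm.
    implied_radius = radius_for(100_000, 30_000)
    print(f"implied radius: {implied_radius:.2f} cm")
    print(f"check: {rcf(30_000, implied_radius):.0f} g")
```

The implied radius is an effective value only; the force actually experienced by the sample varies between the top and the bottom of the tube.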

[0196] A volume of 200 μl of virion purified according to the protocol described above, and containing a reverse transcriptase activity of approximately 1-5 million dpm, is thawed at 37° C. until a liquid phase appears, and then placed on ice. A 5-fold concentrated buffer was prepared with the following components: 500 mM Tris-HCl pH 8.2; 75 mM NaCl; 25 mM MgCl₂; 75 mM DTT and 0.10% NP 40. 100 μl of 5× buffer + 25 μl of a 100 mM solution of dATP + 25 μl of a 100 mM solution of dTTP + 25 μl of a 100 mM solution of dGTP + 25 μl of a 100 mM solution of dCTP + 100 μl of sterile distilled water + 200 μl of the virion suspension (RT activity of 5 million dpm) in PBS were mixed and incubated at 42° C. for 3 hours. After this incubation, the reaction mixture is added directly to a buffered phenol/chloroform/isoamyl alcohol mixture (Sigma ref. P 3803); the aqueous phase is collected and one volume of sterile distilled water is added to the organic phase to re-extract the residual nucleic acid material. The collected aqueous phases are combined, and the nucleic acids contained are precipitated by adding 3M sodium acetate pH 5.2 to 1/10 volume + 2 volumes of ethanol + 1 μl of glycogen (Boehringer-Mannheim ref. 901 393) and placing the sample at −20° C. for 4 h or overnight at +4° C. The precipitate obtained after centrifugation is then washed with 70% ethanol and resuspended in 60 μl of distilled water. The products of this reaction were then purified, cloned and sequenced according to the protocol which will now be described. Blunt-ended DNAs with unpaired adenines at the ends were generated. A “filling-in” reaction was first performed: 25 μl of the previously purified DNA solution were mixed with 2 μl of a 2.5 mM solution containing, in equimolar amounts, dATP+dGTP+dTTP+dCTP/1 μl of T4 DNA polymerase (Boehringer-Mannheim ref. 1004 786)/5 μl of 10× “incubation buffer for restriction enzyme” (Boehringer-Mannheim ref. 1417 975)/1 μl of a 1% bovine serum albumin solution/16 μl of sterile distilled water.
This mixture was incubated for 20 minutes at 11° C. Then 50 μl of TE buffer and 1 μl of glycogen (Boehringer-Mannheim ref. 901 393) were added thereto before extraction of the nucleic acids with phenol/chloroform/isoamyl alcohol (Sigma ref. P 3803) and precipitation with sodium acetate as described above. The DNA precipitated after centrifugation is resuspended in 10 μl of 10 mM Tris buffer pH 7.5. Then 5 μl of this suspension were mixed with 20 μl of 5× Taq buffer, 20 μl of 5 mM dATP, 1 μl (5U) of Taq DNA polymerase (Amplitaq™) and 54 μl of sterile distilled water. This mixture is incubated for 2 h at 75° C. with a film of oil on the surface of the solution. The DNA suspended in the aqueous solution drawn off under the film of oil after incubation is precipitated as described above and resuspended in 2 μl of sterile distilled water. The DNA obtained was inserted into a plasmid using the TA Cloning™ kit. The 2 μl of DNA solution were mixed with 5 μl of sterile distilled water, 1 μl of a 10-fold concentrated ligation buffer “10×LIGATION BUFFER”, 2 μl of “pCR™ VECTOR” (25 ng/ml) and 1 μl of “TA DNA LIGASE”. This mixture was incubated overnight at 12° C. The following steps were carried out according to the instructions of the TA Cloning™ kit (British Biotechnology). At the end of the procedure, the white colonies of recombinant bacteria were picked out in order to be cultured and to permit extraction of the plasmids incorporated according to the so-called “miniprep” procedure (17). The plasmid preparation from each recombinant colony was cut with a suitable restriction enzyme and analysed on agarose gel. Plasmids possessing an insert detected under UV light after staining the gel with ethidium bromide were selected for sequencing of the insert, after hybridization with a primer complementary to the Sp6 promoter present on the cloning plasmid of the TA Cloning™ kit.
The reaction prior to sequencing was then performed according to the method recommended for the use of the sequencing kit “Prism ready reaction kit dye deoxyterminator cycle sequencing kit” (Applied Biosystems, ref. 401384), and automatic sequencing was carried out with an Applied Biosystems “Automatic Sequencer, model 373 A” apparatus according to the manufacturer's instructions.
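As a check on the composition of the endogenous reverse transcription mixture of paragraph [0196], the final concentrations may be worked out as in the following sketch. The stock concentrations and the relative volumes are those given in the text; the proportions do not depend on the volume unit. The names used are hypothetical, and the salts contributed by the PBS of the virion suspension are ignored, which is a simplifying assumption.

```python
# Components of the 5-fold concentrated buffer, as listed in the text.
BUFFER_5X = {
    "Tris-HCl (mM)": 500,
    "NaCl (mM)": 75,
    "MgCl2 (mM)": 25,
    "DTT (mM)": 75,
    "NP 40 (%)": 0.10,
}

# Relative volumes of each component of the reaction mixture.
VOLUMES = {
    "5x buffer": 100,
    "dATP": 25,
    "dTTP": 25,
    "dGTP": 25,
    "dCTP": 25,
    "water": 100,
    "virion suspension": 200,
}

DNTP_STOCK_MM = 100  # each dNTP stock solution is 100 mM


def final_concentration(stock, volume, total):
    """Concentration after diluting `volume` of `stock` into `total`."""
    return stock * volume / total


TOTAL = sum(VOLUMES.values())
buffer_final = {
    name: final_concentration(conc, VOLUMES["5x buffer"], TOTAL)
    for name, conc in BUFFER_5X.items()
}
dntp_final_mm = final_concentration(DNTP_STOCK_MM, VOLUMES["dATP"], TOTAL)

if __name__ == "__main__":
    print("total volume:", TOTAL)
    for name, conc in buffer_final.items():
        print(name, conc)
    print("each dNTP (mM):", dntp_final_mm)
```

The buffer makes up one fifth of the total volume, so it ends at its 1× working strength, which is consistent with its description as a 5-fold concentrated buffer.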

[0197] Discriminating analysis on the computerized data banks of the sequences cloned from the DNA fragments present in the reaction mixture enabled a retroviral type sequence to be revealed. The corresponding clone PSJ17 was completely sequenced, and the sequence obtained, presented in FIG. 6 and identified by SEQ ID NO: 9, was analysed using the “Geneworks®” software on the updated “GenBank™” data banks. An identical sequence already described could not be found by analysis of the data banks. Only a partial homology with some known retroviral elements was to be found. The most useful relative homology relates to an endogenous retrovirus designated ERV-9, or HSERV-9, according to the references (18).

EXAMPLE 4 PCR Amplification of the Nucleic Acid Sequence Contained Between the 5′ Region Defined by the Clone “POL MSRV-1B” and the 3′ Region Defined by the Clone PSJ17

[0198] Five oligonucleotides, M001, M002-A, M003-BCD, P004 and P005, were defined in order to amplify the RNA originating from purified POL-2 virions. Control reactions were performed so as to check for the presence of contaminants (reaction with water). The amplification consists of an RT-PCR step according to the protocol described in Example 2, followed by a “nested” PCR according to the PCR protocol described in the document EP-A-0,569,272. In the first RT-PCR cycle, the primers M001 and P004 or P005 are used. In the second PCR cycle, the primers M002-A or M003-BCD and the primer P004 are used. The primers are positioned as follows:

[0199] Their composition is:
primer M001: GGTCITICCICAIGG (SEQ ID NO: 19)
primer M002-A: TTAGGGATAGCCCTCATCTCT (SEQ ID NO: 20)
primer M003-BCD: TCAGGGATAGCCCCCATCTAT (SEQ ID NO: 17)
primer P004: AACCCTTTGCCACTACATCAATTT (SEQ ID NO: 18)
primer P005: GCGTAAGGACTCCTAGAGCTATT (SEQ ID NO: 21)

[0200] The “nested” amplification product obtained, and designated M003-P004, is presented in FIG. 7, and corresponds to the sequence SEQ ID NO: 8.

EXAMPLE 5 Amplification and Cloning of a Portion of the MSRV-1 Retroviral Genome using a Sequence Already Identified, in a Sample of Virus Purified at the Peak of Reverse Transcriptase Activity

[0201] A PCR technique derived from the technique published by Frohman (19) was used. The technique derived makes it possible, using a specific primer at the 3′ end of the genome to be amplified, to elongate the sequence towards the 5′ region of the genome to be analyzed. This technical variant is described in the documentation of the firm “Clontech Laboratories Inc.” (Palo-Alto Calif., USA) supplied with its product “5′-AmpliFINDER™ RACE Kit”, which was used on a fraction of virion purified as described above.

[0202] The specific 3′ primers used in the kit protocol for the synthesis of the cDNA and the PCR amplification are, respectively, complementary to the following MSRV-1 sequences:
cDNA: TCATCCATGTACCGAAGG (SEQ ID NO: 22)
amplification: ATGGGGTTCCCAAGTTCCCT (SEQ ID NO: 23)

[0203] The products originating from the PCR were obtained after purification on agarose gel according to conventional methods (17), and then resuspended in 10 μl of distilled water. Since one of the properties of Taq polymerase consists in adding an adenine at the 3′ end of each of the two DNA strands, the DNA obtained was inserted directly into a plasmid using the TA Cloning™ kit (British Biotechnology). The 2 μl of DNA solution were mixed with 5 μl of sterile distilled water, 1 μl of a 10-fold concentrated ligation buffer “10×LIGATION BUFFER”, 2 μl of “pCR™ VECTOR” (25 ng/ml) and 1 μl of “TA DNA LIGASE”. This mixture was incubated overnight at 12° C. The following steps were carried out according to the instructions of the TA Cloning™ kit (British Biotechnology). At the end of the procedure, the white colonies of recombinant bacteria were picked out in order to be cultured and to permit extraction of the plasmids incorporated according to the so-called “miniprep” procedure (17). The plasmid preparation from each recombinant colony was cut with a suitable restriction enzyme and analyzed on agarose gel. Plasmids possessing an insert detected under UV light after staining the gel with ethidium bromide were selected for sequencing of the insert, after hybridization with a primer complementary to the Sp6 promoter present on the cloning plasmid of the TA Cloning™ kit. The reaction prior to sequencing was then performed according to the method recommended for the use of the sequencing kit “Prism ready reaction kit dye deoxyterminator cycle sequencing kit” (Applied Biosystems, ref. 401384), and automatic sequencing was carried out with an Applied Biosystems “Automatic Sequencer, model 373 A” apparatus according to the manufacturer's instructions.

[0204] This technique was applied first to two fractions of virion purified as described below on sucrose from the “POL-2” isolate produced by the PLI-2 line on the one hand, and from the MS7PG isolate produced by the LM7PC line on the other hand. The culture supernatants are collected twice weekly, precentrifuged at 10,000 rpm for 30 minutes to remove cell debris and then frozen at −80° C. or used as they are for the following steps. The fresh or thawed supernatants are centrifuged on a cushion of 30% glycerol-PBS at 100,000 g (or 30,000 rpm in a type 45 T LKB-HITACHI rotor) for 2 h at 4° C. After removal of the supernatant, the sedimented pellet is taken up in a small volume of PBS and constitutes the fraction of concentrated but unpurified virions. The concentrated virus is then applied to a sucrose gradient in sterile PBS buffer (15 to 50% weight/weight) and ultracentrifuged at 35,000 rpm (100,000 g) for 12 h at +4° C. in a swing-out rotor. Ten fractions are collected, and 20 μl are withdrawn from each fraction after homogenization to assay the reverse transcriptase activity therein according to the technique described by H. Perron (3). The fractions containing the peak of “LM7-like” RT activity are then diluted in sterile PBS buffer and ultracentrifuged for one hour at 35,000 rpm (100,000 g) to sediment the viral particles. The pellet of purified virion thereby obtained is then taken up in a small volume of a buffer which is appropriate for the extraction of RNA. The cDNA synthesis reaction mentioned above is carried out on this RNA extracted from purified extracellular virion. PCR amplification according to the technique mentioned above enabled the clone F-1-11 to be obtained, whose sequence, identified by SEQ ID NO: 2, is presented in FIG. 8.

[0205] This clone makes it possible to define, with the different clones previously sequenced, a region of considerable length (1.2 kb) representative of the “pol” gene of the MSRV-1 retrovirus, as presented in FIG. 9. This sequence, designated SEQ ID NO: 1, is reconstituted from different clones overlapping one another at their ends, correcting the artifacts associated with the primers and with the amplification or cloning techniques which would artificially interrupt the reading frame of the whole. This sequence will be identified below under the designation “MSRV-1 pol* region”. Its degree of homology with the HSERV-9 sequence is shown in FIG. 12.

[0206] In FIG. 9, the potential reading frame with its translation into amino acids is presented below the nucleic acid sequence.

EXAMPLE 6 Detection of Specific MSRV-1 and MSRV-2 Sequences in Different Samples of Plasma Originating from Patients Suffering from MS or from Controls

[0207] A PCR technique was used to detect the MSRV-1 and MSRV-2 genomes in plasmas obtained from blood samples collected onto EDTA from patients suffering from MS and from non-MS controls.

[0208] Extraction of the RNAs from plasma was performed according to the technique described by P. Chomczynski (20), after adding one volume of buffer containing guanidinium thiocyanate to 1 ml of plasma stored frozen at −80° C. after collection.

[0209] For MSRV-2, the PCR was performed under the same conditions and with the following primers:

[0210] 5′ primer, identified by SEQ ID NO: 14 5′ GTAGTTCGATGTAGAAAGCG 3′;

[0211] 3′ primer, identified by SEQ ID NO: 13 5′ GCATCCGGCAACTGCACG 3′.

[0212] However, similar results were also obtained with the following PCR primers in two successive amplifications by “nested” PCR on samples of nucleic acids not treated with DNase.

[0213] The primers used for this first step of 40 cycles with a hybridization temperature of 48° C. are the following:

[0214] 5′ primer, identified by SEQ ID NO: 24 5′ GCCGATATCACCCGCCATGG 3′, corresponding to a 5′ MSRV-2 PCR primer, for a first PCR on samples from patients,

[0215] 3′ primer, identified by SEQ ID NO: 13 5′ GCATCCGGCAACTGCACG 3′, corresponding to a 3′ MSRV-2 PCR primer, for a first PCR on samples from patients.

[0216] After this step, 10 μl of the amplification product are taken and used to carry out a second, so-called “nested” PCR amplification with primers located within the region already amplified. This second step takes place over 35 cycles, with a primer hybridization (“annealing”) temperature of 50° C. The reaction volume is 100 μl.

[0217] The primers used for this second step are the following:

[0218] 5′ primer, identified by SEQ ID NO: 25 5′ CGCGATGCTGGTTGGAGAGC 3′, corresponding to a 5′ MSRV-2 PCR primer, for a nested PCR on samples from patients,

[0219] 3′ primer, identified by SEQ ID NO: 26 5′ TCTCCACTCCGAATATTCCG 3′, corresponding to a 3′ MSRV-2 PCR primer, for a nested PCR on samples from patients.

[0220] For MSRV-1, the amplification was performed in two steps. Furthermore, the nucleic acid sample is treated beforehand with DNase, and a control PCR without RT (AMV reverse transcriptase) is performed on the two amplification steps so as to verify that the RT-PCR amplification comes exclusively from the MSRV-1 RNA. In the event of a positive control without RT, the initial aliquot sample of RNA is again treated with DNase and amplified again.

[0221] The protocol for treatment with DNase lacking RNAse activity is as follows: the extracted RNA is aliquoted in the presence of “RNAse inhibitor” (Boehringer-Mannheim) in water treated with DEPC at a final concentration of 1 μg in 10 μl; to these 10 μl, 1 μl of “RNAse-free DNAse” (Boehringer-Mannheim) and 1.2 μl of pH 5 buffer containing 0.1 M sodium acetate and 5 mM MgSO₄ are added; the mixture is incubated for 15 min at 20° C. and brought to 95° C. for 1.5 min in a “thermocycler”.

[0222] The first MSRV-1 RT-PCR step is performed according to a variant of the RNA amplification method as described in Patent Application No. EP-A-0,569,272. In particular, the cDNA synthesis step is performed at 42° C. for one hour; the PCR amplification takes place over 40 cycles, with a primer hybridization (“annealing”) temperature of 53° C. The reaction volume is 100 μl.

[0223] The primers used for this first step are the following:

[0224] 5′ primer, identified by SEQ ID NO: 15 5′ AGGAGTAAGGAAACCCAACGGAC 3′;

[0225] 3′ primer, identified by SEQ ID NO: 16 5′ TAAGAGTTGCACAAGTGCG 3′.

[0226] After this step, 10 μl of the amplification product are taken and used to carry out a second, so-called “nested” PCR amplification with primers located within the region already amplified. This second step takes place over 35 cycles, with a primer hybridization (“annealing”) temperature of 53° C. The reaction volume is 100 μl.

[0227] The primers used for this second step are the following:

[0228] 5′ primer, identified by SEQ ID NO: 17 5′ TCAGGGATAGCCCCCATCTAT 3′;

[0229] 3′ primer, identified by SEQ ID NO: 18 5′ AACCCTTTGCCACTACATCAATTT 3′.
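Purely as an aid to the reader, the basic properties of the four MSRV-1 primers given above (length, G+C content and an approximate melting temperature) may be tabulated with the following sketch. The melting temperature is estimated by the simple rule of thumb Tm = 2 × (A+T) + 4 × (G+C), which is an assumption introduced here for illustration and is not the method by which the annealing temperature of 53° C. was chosen; the function names are hypothetical.

```python
# MSRV-1 primers as given in the text (first step: 15 and 16; nested: 17 and 18).
PRIMERS = {
    "SEQ ID NO: 15": "AGGAGTAAGGAAACCCAACGGAC",
    "SEQ ID NO: 16": "TAAGAGTTGCACAAGTGCG",
    "SEQ ID NO: 17": "TCAGGGATAGCCCCCATCTAT",
    "SEQ ID NO: 18": "AACCCTTTGCCACTACATCAATTT",
}

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def gc_fraction(seq):
    """Fraction of G and C residues in the primer."""
    return sum(1 for base in seq if base in "GC") / len(seq)


def wallace_tm(seq):
    """Rule-of-thumb melting temperature: 2 per A/T plus 4 per G/C (deg C)."""
    gc = sum(1 for base in seq if base in "GC")
    at = sum(1 for base in seq if base in "AT")
    return 2 * at + 4 * gc


def reverse_complement(seq):
    """Sequence of the opposite strand, written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(name, len(seq), f"{gc_fraction(seq):.2f}", wallace_tm(seq))
```

The rule of thumb tends to overestimate the melting temperature of primers longer than about 20 nucleotides, so the values serve only to compare the primers with one another.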

[0230] FIGS. 10 and 11 present the results of PCR in the form of photographs under ultraviolet light of ethidium bromide-impregnated agarose gels, in which an electrophoresis of the PCR amplification products applied separately to the different wells was performed.

[0231] The top photograph (FIG. 10) shows the result of specific MSRV-2 amplification.

[0232] Well number 8 contains a mixture of DNA molecular weight markers, and wells 1 to 7 represent, in order, the products amplified from the total RNAs of plasmas originating from 4 healthy controls free from MS (wells 1 to 4) and from 3 patients suffering from MS at different stages of the disease (wells 5 to 7).

[0233] In this series, MSRV-2 nucleic acid material is detected in the plasma of one case of MS out of the 3 tested, and in none of the 4 control plasmas. Other results obtained on more extensive series confirm these results.

[0234] The bottom photograph (FIG. 11) shows the result of specific amplification by MSRV-1 “nested” RT-PCR:

[0235] well No. 1 contains the PCR product produced with water alone, without the addition of AMV reverse transcriptase; well No. 2 contains the PCR product produced with water alone, with the addition of AMV reverse transcriptase; well No. 3 contains a mixture of DNA molecular weight markers; wells 4 to 13 contain, in order, the products amplified from the total RNAs extracted from sucrose gradient fractions (collected in a downward direction), on which gradient a pellet of virion originating from a supernatant of a culture infected with MSRV-1 and MSRV-2 was centrifuged to equilibrium according to the protocol described by H. Perron (13); to well 14 nothing was applied; to wells 15 to 17, the amplified products of RNA extracted from plasmas originating from 3 different patients suffering from MS at different stages of the disease were applied.

[0236] The MSRV-1 retroviral genome is indeed to be found in the sucrose gradient fraction containing the peak of reverse transcriptase activity measured according to the technique described by H. Perron (3), with a very strong intensity (fraction 5 of the gradient, placed in well No. 8). A slight amplification has taken place in the first fraction (well No. 4), probably corresponding to RNA released by lysed particles which floated at the surface of the gradient; similarly, aggregated debris has sedimented in the last fraction (tube bottom), carrying with it a few copies of the MSRV-1 genome which have given rise to an amplification of low intensity.

[0237] Of the 3 MS plasmas tested in this series, MSRV-1 RNA turned up in one case, producing a very intense amplification (well No. 17).

[0238] In this series, the MSRV-1 retroviral RNA genome, probably corresponding to particles of extracellular virus present in the plasma in extremely small numbers, was detected by “nested” RT-PCR in one case of MS out of the 3 tested. Other results obtained on more extensive series confirm these results.

[0239] Furthermore, the specificity of the sequences amplified by these PCR techniques may be verified and evaluated by the “ELOSA” technique as described by F. Mallet (21) and in the document FR-A-2,663,040.

[0240] For MSRV-1, the products of the nested PCR described above may be tested in two ELOSA systems enabling a consensus A and a consensus B+C+D of MSRV-1 to be detected separately, corresponding to the subfamilies described in Example 1 and FIGS. 1 and 2. In effect, the sequences closely resembling the consensus B+C+D are to be found essentially in the RNA samples originating from MSRV-1 virions purified from cultures or amplified in extracellular biological fluids of MS patients, whereas the sequences closely resembling the consensus A are essentially to be found in normal human cellular DNA.

[0241] The ELOSA/MSRV-1 system for the capture and specific hybridization of the PCR products of the subfamily A uses a capture oligonucleotide cpV1A with an amine bond at the 5′ end and a biotinylated detection oligonucleotide dpV1A having as their sequence, respectively:

[0242] cpV1A identified by SEQ ID NO: 27

[0243] 5′ GATCTAGGCCACTTCTCAGGTCCAGS 3′, corresponding to the ELOSA capture oligonucleotide for the products of MSRV-1 nested PCR performed with the primers identified by SEQ ID NO: 15 and SEQ ID NO: 16, optionally followed by amplification with the primers identified by SEQ ID NO: 17 and SEQ ID NO: 18 on samples from patients;

[0244] dpV1A identified by SEQ ID NO: 28;

[0245] 5′ CATCTITTTGGICAGGCAITAGC 3′, corresponding to the ELOSA detection oligonucleotide for the subfamily A of the products of MSRV-1 “nested” PCR performed with the primers identified by SEQ ID NO: 15 and SEQ ID NO: 16, optionally followed by amplification with the primers identified by SEQ ID NO: 17 and SEQ ID NO: 18 on samples from patients.

[0246] The ELOSA/MSRV-1 system for the capture and specific hybridization of the PCR products of the subfamily B+C+D uses the same biotinylated detection oligonucleotide dpV1A and a capture oligonucleotide cpV1B with an amine bond at the 5′ end having as its sequence:

[0247] cpV1B identified by SEQ ID NO: 29

[0248] 5′ CTTGAGCCAGTTCTCATACCTGGA 3′, corresponding to the ELOSA capture oligonucleotide for the subfamily B+C+D of the products of MSRV-1 “nested” PCR performed with the primers identified by SEQ ID NO: 15 and SEQ ID NO: 16, optionally followed by amplification with the primers identified by SEQ ID NO: 17 and SEQ ID NO: 18 on samples from patients.

[0249] This ELOSA detection system enabled it to be verified that none of the PCR products thus amplified from DNase-treated plasmas of MS patients contained a sequence of the subfamily A, and that all were positive with the consensus of the subfamilies B, C and D.

[0250] For MSRV-2, a similar ELOSA technique was evaluated on isolates originating from infected cell cultures, using the following PCR amplification primers,

[0251] 5′ primer, identified by SEQ ID NO: 30

[0252] 5′ AGTGYTRCCMCARGGCGCTGAA 3′, corresponding to a 5′ MSRV-2 PCR primer, for PCR on samples from cultures,

[0253] 3′ primer, identified by SEQ ID NO: 31

[0254] 5′ GMGGCCAGCAGSAKGTCATCCA 3′, corresponding to a 3′ MSRV-2 PCR primer, for PCR on samples from cultures,

[0255] and the capture oligonucleotide cpV2 with an amine bond at the 5′ end and the biotinylated detection oligonucleotide dpV2 having as their respective sequences:

[0256] - cpV2 identified by SEQ ID NO: 32

[0257] 5′ GGATGCCGCCTATAGCCTCTAC 3′, corresponding to an ELOSA capture oligonucleotide for the products of MSRV-2 PCR performed with the primers SEQ ID NO: 34 and SEQ ID NO: 35, or optionally with the degenerate primers defined by Shih (12).

[0258] dpV2 identified by SEQ ID NO: 33

[0259] 5′ AAGCCTATCGCGTGCAGTTGCC 3′, corresponding to an ELOSA detection oligonucleotide for the products of MSRV-2 PCR performed with the primers SEQ ID NO: 30 and SEQ ID NO: 31, or optionally with the degenerate primers defined by Shih (12).

[0260] This PCR amplification system with a pair of primers different from those which were described previously for amplification on the samples from patients made it possible to confirm the infection with MSRV-2 of in vitro cultures and of samples of nucleic acids used for the molecular biology studies.

[0261] All things considered, the first results of PCR detection of the genome of pathogenic and/or infective agents show that it is possible that free “virus” may circulate in the blood stream of patients in an acute, virulent phase, outside the nervous system. This is compatible with the almost invariable presence of “gaps” in the blood-brain barrier of patients in an active phase of MS.
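The MSRV-2 culture primers identified by SEQ ID NO: 30 and SEQ ID NO: 31 contain degenerate positions written with the usual IUPAC ambiguity symbols (Y, R, M, S, K). A minimal sketch of how such a degenerate primer expands into the mixture of individual sequences it represents is given below; the function name is hypothetical, and only the IUPAC symbols are assumed beyond what is stated in the text.

```python
from itertools import product

# IUPAC nucleotide ambiguity symbols and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT",
}

PRIMER_5 = "AGTGYTRCCMCARGGCGCTGAA"  # SEQ ID NO: 30
PRIMER_3 = "GMGGCCAGCAGSAKGTCATCCA"  # SEQ ID NO: 31


def expand(primer):
    """Return every non-degenerate sequence covered by a degenerate primer."""
    choices = [IUPAC[symbol] for symbol in primer.upper()]
    return ["".join(bases) for bases in product(*choices)]


if __name__ == "__main__":
    for name, primer in (("SEQ ID NO: 30", PRIMER_5), ("SEQ ID NO: 31", PRIMER_3)):
        variants = expand(primer)
        print(name, "represents", len(variants), "sequences")
```

The number of variants grows multiplicatively with each degenerate position, which is one reason why degenerate primers used as a mixture behave differently under different technical conditions.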

[0262] EXAMPLE 7

Obtaining Sequences of the “env” Gene of the MSRV-1 Retroviral Genome

[0263] As has already been described in Example 5, a PCR technique derived from the technique published by Frohman (19) was used. The technique derived makes it possible, using a specific primer at the 3′ end of the genome to be amplified, to elongate the sequence towards the 5′ region of the genome to be analysed. This technical variant is described in the documentation of “Clontech Laboratories Inc.” (Palo-Alto Calif., USA) supplied with its product “5′-AmpliFINDER™ RACE Kit”, which was used on a fraction of virion purified as described above.

[0264] In order to carry out an amplification of the 3′ region of the MSRV-1 retroviral genome encompassing the region of the “env” gene, a study was carried out to determine a consensus sequence in the LTR regions of the same type as those of the defective endogenous retrovirus HSERV-9 (18, 24), with which the MSRV-1 retrovirus displays partial homologies.

[0265] The same specific 3′ primer was used in the kit protocol for the synthesis of the cDNA and the PCR amplification; its sequence is as follows:

[0266] GTGCTGATTGGTGTATTTACAATCC (SEQ ID NO: 41)

[0267] Synthesis of the complementary DNA (cDNA) and unidirectional PCR amplification with the above primer were carried out in one step according to the method described in Patent EP-A-0,569,272.

[0268] The products originating from the PCR were extracted after purification on agarose gel according to conventional methods (17), and then resuspended in 10 μl of distilled water. Since one of the properties of Taq polymerase consists in adding an adenine at the 3′ end of each of the two DNA strands, the DNA obtained was inserted directly into a plasmid using the TA Cloning™ kit (British Biotechnology). The 2 μl of DNA solution were mixed with 5 μl of sterile distilled water, 1 μl of a 10-fold concentrated ligation buffer “10×LIGATION BUFFER”, 2 μl of “pCR™ VECTOR” (25 ng/ml) and 1 μl of “TA DNA LIGASE”. This mixture was incubated overnight at 12° C. The following steps were carried out according to the instructions of the TA Cloning™ kit (British Biotechnology). At the end of the procedure, the white colonies of recombinant bacteria were picked out in order to be cultured and to permit extraction of the plasmids incorporated according to the so-called “miniprep” procedure (17). The plasmid preparation from each recombinant colony was cut with a suitable restriction enzyme and analyzed on agarose gel. Plasmids possessing an insert detected under UV light after staining the gel with ethidium bromide were selected for sequencing of the insert, after hybridization with a primer complementary to the Sp6 promoter present on the cloning plasmid of the TA Cloning™ kit. The reaction prior to sequencing was then performed according to the method recommended for the use of the sequencing kit “Prism ready reaction kit dye deoxyterminator cycle sequencing kit” (Applied Biosystems, ref. 401384), and automatic sequencing was carried out with an Applied Biosystems “Automatic Sequencer, model 373 A” apparatus according to the manufacturer's instructions.

[0269] This technical approach was applied to a sample of virion concentrated as described below from a mixture of culture supernatants produced by B lymphoblastoid lines such as are described in Example 2, established from lymphocytes of patients suffering from MS and possessing reverse transcriptase activity which is detectable according to the technique described by Perron et al. (3): the culture supernatants are collected twice weekly, precentrifuged at 10,000 rpm for 30 minutes to remove cell debris and then frozen at −80° C. or used as they are for the following steps. The fresh or thawed supernatants are centrifuged on a cushion of 30% glycerol-PBS at 100,000 g for 2 h at 4° C. After removal of the supernatant, the sedimented pellet constitutes the sample of concentrated but unpurified virions. The pellet thereby obtained is then taken up in a small volume of an appropriate buffer for the extraction of RNA. The cDNA synthesis reaction mentioned above is carried out on this RNA extracted from concentrated extracellular virion.

[0270] RT-PCR amplification according to the technique mentioned above enabled the clone FBd3 to be obtained, whose sequence, identified by SEQ ID NO: 42, is presented in FIG. 13.

[0271] In FIG. 14, the sequence homology between the clone FBd3 and the HSERV-9 retrovirus is shown on the matrix chart by a continuous line for any partial homology greater than or equal to 65%. It can be seen that there are homologies in the flanking regions of the clone (with the pol gene at the 5′ end and with the env gene and then the LTR at the 3′ end), but that the internal region is totally divergent and does not display any homology, even weak, with the “env” gene of HSERV-9. Furthermore, it is apparent that the clone FBd3 contains a longer “env” region than the one which is described for the defective endogenous HSERV-9; it may thus be seen that the internal divergent region constitutes an “insert” between the regions of partial homology with the HSERV-9 defective genes.

EXAMPLE 8 Amplification, Cloning and Sequencing of the Region of the MSRV-1 Retroviral Genome Located Between the Clones PSJ17 and FBd3

[0272] Four oligonucleotides, F1, B4, F6 and B1, were defined for amplifying RNA originating from concentrated virions of the strains POL2 and MS7PG. Control reactions were performed so as to check for the presence of contaminants (reaction with water). The amplification consists of a first step of RT-PCR according to the protocol described in Patent Application EP-A-0,569,272, followed by a second step of PCR performed on 10 μl of product of the first step with primers internal to the first amplified region (“nested” PCR). In the first RT-PCR cycle, the primers F1 and B4 are used. In the second PCR cycle, the primers F6 and B1 are used. The primers are positioned as follows:

[0273] Their composition is:
primer F1: TGATGTGAACGGCATACTCACTG (SEQ ID NO: 43)
primer B4: CCCAGAGGTTAGUAACTCCCTTTC (SEQ ID NO: 44)
primer F6: GCTAAAGGAGACTTGTGQTTGTCAG (SEQ ID NO: 45)
primer B1: CAACATGGGCATTTCGGATTAG (SEQ ID NO: 46)
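Before primer sequences such as those listed above are ordered or aligned, it can be useful to confirm that each string contains only the four DNA bases. The following is a minimal sketch of such a check, not part of the described protocol; the function names are hypothetical, and the primer strings are copied exactly as they are printed in this paragraph. Any symbol it reports would need to be resolved against the sequence listing (SEQ ID NO: 43 to 46) rather than guessed.

```python
# Primer strings copied exactly as printed in paragraph [0273].
PRIMERS = {
    "F1": "TGATGTGAACGGCATACTCACTG",
    "B4": "CCCAGAGGTTAGUAACTCCCTTTC",
    "F6": "GCTAAAGGAGACTTGTGQTTGTCAG",
    "B1": "CAACATGGGCATTTCGGATTAG",
}

DNA_ALPHABET = set("ACGT")


def non_dna_symbols(sequence):
    """Return (0-based index, symbol) pairs for characters outside A/C/G/T."""
    return [
        (index, symbol)
        for index, symbol in enumerate(sequence.upper())
        if symbol not in DNA_ALPHABET
    ]


def check_primers(primers):
    """Map each primer name to its list of unexpected symbols (empty if clean)."""
    return {name: non_dna_symbols(seq) for name, seq in primers.items()}


if __name__ == "__main__":
    for name, problems in check_primers(PRIMERS).items():
        status = "ok" if not problems else f"unexpected symbols: {problems}"
        print(f"{name} ({len(PRIMERS[name])} nt): {status}")
```

As printed, primers B4 and F6 each contain one symbol outside the DNA alphabet, which is most plausibly a transcription artefact of the text rather than a feature of the oligonucleotides.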

[0274] The product of “nested” amplification obtained and designated “tpol” is presented in FIG. 15, and corresponds to the sequence SEQ ID NO:47.

EXAMPLE 9 Obtaining New Sequences, Expressed as RNA in Cells in Culture Producing MSRV-1, and Comprising an “env” Region of the MSRV-1 Retroviral Genome

[0275] A library of cDNA was produced according to the procedure described by the manufacturer of the “cDNA synthesis module, cDNA rapid adaptator ligation module, cDNA rapid cloning module and lambda gt10 in vitro packaging module” kits (Amersham, ref RPN1256Y/Z, RPN1712, RPN1713, RPN1717, N334Z), from the messenger RNA extracted from cells of a B lymphoblastoid line such as is described in Example 2, established from the lymphocytes of a patient suffering from MS and possessing reverse transcriptase activity which is detectable according to the technique described by Perron et al. (3).

[0276] Oligonucleotides were defined for amplifying the cDNA cloned into the nucleic acid library between the 3′ region of the clone PSJ17 (pol) and the 5′ (LTR) region of the clone FBd3. Control reactions were performed so as to check for the presence of contaminants (reaction with water). PCR reactions performed on the nucleic acids cloned into the library with different pairs of primers enabled a series of clones linking pol sequences to the MSRV-1 type env or LTR sequences to be amplified.

[0277] Two clones are representative of the sequences obtained in the cellular cDNA library:

[0278] the clone JLBc1, whose sequence SEQ ID NO: 48 is presented in FIG. 16;

[0279] the clone JLBc2, whose sequence SEQ ID NO: 49 is presented in FIG. 17.

[0280] The sequences of the clones JLBc1 and JLBc2 are homologous to that of the clone FBd3, as is apparent in FIGS. 18 and 19. The homology between the clone JLBc1 and the clone JLBc2 is shown in FIG. 20.

[0281] The homologies between the clones JLBc1 and JLBc2 on the one hand and the HSERV-9 sequence on the other hand are presented, respectively, in FIGS. 21 and 22.

[0282] It will be noted that the region of homology between JLBc1, JLBc2 and FBd3 comprises, with a few sequence and size variations of the “insert”, the additional sequence absent (“inserted”) in the HSERV-9 env sequence, as described in Example 8.

[0283] It will also be noted that the cloned “pol” region is very homologous to HSERV-9, does not possess a reading frame (bearing in mind the sequence errors induced by the techniques used, including even the automatic sequencer) and diverges from the MSRV-1 sequences obtained from virions. In view of the fact that these sequences were cloned from the RNA of cells expressing MSRV-1 particles, it is probable that they originate from endogenous retroviral elements related to the ERV9 family; this is all the more likely since the pol and env genes are present on the same RNA, which is clearly not the MSRV-1 genomic RNA. Some of these ERV9 elements possess functional LTRs which can be activated by replicative viruses coding for homologous or heterologous transactivators. Under these conditions, the relationship between MSRV-1 and HSERV-9 makes probable the transactivation of the defective (or otherwise) endogenous ERV9 elements by homologous, or even identical, MSRV-1 transactivating proteins.

[0284] Such a phenomenon may induce a viral interference between the expression of MSRV-1 and the related endogenous elements. Such an interference generally leads to a so-called “defective-interfering” expression, some features of which were found in the MSRV-1-infected cultures studied. Furthermore, such a phenomenon is bound to generate the expression of polypeptides, or even of endogenous retroviral proteins, which are not necessarily tolerated by the immune system. Such a scheme of aberrant expression of endogenous elements related to MSRV-1 and induced by the latter is liable to multiply the aberrant antigens, and hence to contribute to the induction of autoimmune processes such as are observed in MS.

[0285] It is, however, essential to note that the clones JLBc1 and JLBc2 differ from the ERV9 or HSERV-9 sequence already described, in that they possess a longer env region comprising an additional region totally divergent from ERV9. Their kinship with the endogenous ERV9 family may hence be defined, but they clearly constitute novel elements never hitherto described. In effect, interrogation of the data banks of nucleic acid sequences available in version No. 15 (1995) of the “Entrez” software (NCBI, NIH, Bethesda, USA) did not enable a known homologous sequence in the env region of these clones to be identified.

EXAMPLE 10 Obtaining Sequences Located in the 5′ pol and 3′ gag Region of the MSRV-1 Retroviral Genome

[0286] As has already been described in Example 5, a PCR technique derived from the technique published by Frohman (19) was used. The technique derived makes it possible, using a specific primer at the 3′ end of the genome to be amplified, to elongate the sequence towards the 5′ region of the genome to be analyzed. This technical variant is described in the documentation of the firm Clontech Laboratories Inc. (Palo Alto, Calif., USA) supplied with its product “5′-AmpliFINDER™ RACE Kit”, which was used on a fraction of virion purified as described above.

[0287] In order to carry out an amplification of the 5′ region of the MSRV-1 retroviral genome starting from the pol sequence already sequenced (clone F11-1) and extending towards the gag gene, MSRV-1 specific primers were defined.

[0288] The specific 3′ primers used in the kit protocol for the synthesis of the cDNA and the PCR amplification are, respectively, complementary to the following MSRV-1 sequences:
cDNA: CCTGAGTTCTTGCACTAACCC (SEQ ID NO: 50)
amplification: GTCCGTTGGGTTTCCTTACTCCT (SEQ ID NO: 51)

[0289] The products originating from the PCR were extracted after purification on agarose gel according to conventional methods (17), and then resuspended in 10 μl of distilled water. Since one of the properties of Taq polymerase consists in adding an adenine at the 3′ end of each of the two DNA strands, the DNA obtained was inserted directly into a plasmid using the TA Cloning™ kit (British Biotechnology). 2 μl of DNA solution were mixed with 5 μl of sterile distilled water, 1 μl of a 10-fold concentrated ligation buffer “10×LIGATION BUFFER”, 2 μl of “pCR™ VECTOR” (25 ng/ml) and 1 μl of “TA DNA LIGASE”. This mixture was incubated overnight at 12° C. The following steps were carried out according to the instructions of the TA Cloning® kit (British Biotechnology). At the end of the procedure, the white colonies of recombinant bacteria were picked out in order to be cultured and to permit extraction of the incorporated plasmids according to the so-called “miniprep” procedure (17). The plasmid preparation from each recombinant colony was cut with a suitable restriction enzyme and analyzed on agarose gel. Plasmids possessing an insert detected under UV light after staining the gel with ethidium bromide were selected for sequencing of the insert, after hybridization with a primer complementary to the Sp6 promoter present on the cloning plasmid of the TA Cloning™ kit. The reaction prior to sequencing was then performed according to the method recommended for the use of the sequencing kit “Prism ready reaction kit dye deoxyterminator cycle sequencing kit” (Applied Biosystems, ref. 401384), and automatic sequencing was carried out with an Applied Biosystems “automatic sequencer, model 373 A” apparatus according to the manufacturer's instructions.

[0290] This technical approach was applied to a sample of virion concentrated as described below from a mixture of culture supernatants produced by B lymphoblastoid lines such as are described in Example 2, established from lymphocytes of patients suffering from MS and possessing reverse transcriptase activity which is detectable according to the technique described by Perron et al. (3): the culture supernatants are collected twice weekly, precentrifuged at 10,000 rpm for 30 minutes to remove cell debris and then frozen at −80° C. or used as they are for the following steps. The fresh or thawed supernatants are centrifuged on a cushion of 30% glycerol-PBS at 100,000 g for 2 h at 4° C. After removal of the supernatant, the sedimented pellet constitutes the sample of concentrated but unpurified virions. The pellet thereby obtained is then taken up in a small volume of an appropriate buffer for the extraction of RNA. The cDNA synthesis reaction mentioned above is carried out on this RNA extracted from concentrated extracellular virion.

[0291] RT-PCR amplification according to the technique mentioned above enabled the clone GM3 to be obtained, whose sequence, identified by SEQ ID NO: 52, is presented in FIG. 23.

[0292] In FIG. 24, the sequence homology between the clone GM3 and the HSERV-9 retrovirus is shown on the matrix chart by a continuous line, for any partial homology greater than or equal to 65%.

[0293] In summary, FIG. 25 shows the localization of the different clones studied above, relative to the known ERV9 genome. In FIG. 25, since the MSRV-1 env region is longer than the reference ERV9 env gene, the additional region is shown above the point of insertion according to a “V”, on the understanding that the inserted material displays a sequence and size variability between the clones shown (JLBc1, JLBc2, FBd3). FIG. 26 shows the position of the different clones studied in the MSRV-1 pol* region.

[0294] By means of the clone GM3 described above, a possible reading frame could be defined, covering the whole of the pol gene, referenced according to SEQ ID NO: 57, shown in the successive FIGS. 27a to 27c.

EXAMPLE 11 Detection of Anti-MSRV-1 Specific Antibodies in Human Serum

[0295] Identification of the sequence of the pol gene of the MSRV-1 retrovirus and of an open reading frame of this gene enabled the amino acid sequence SEQ ID NO: 35 of a region of the said gene, referenced SEQ ID NO: 36, to be determined (see FIG. 28).

[0296] Different synthetic peptides corresponding to fragments of the protein sequence of MSRV-1 reverse transcriptase encoded by the pol gene were tested for their antigenic specificity with respect to sera of patients suffering from MS and of healthy controls.

[0297] The peptides were synthesized chemically by solid-phase synthesis according to the Merrifield technique (Barany G, and Merrifield R. B, 1980, In The Peptides, 2, 1-284, Gross E and Meienhofer J, Eds., Academic Press, New York). The practical details are those described below.

[0298] a) Peptide synthesis:

[0299] The peptides were synthesized on a phenylacetamidomethyl (PAM)/polystyrene/divinylbenzene resin (Applied Biosystems, Inc. Foster City, Calif.), using an “Applied Biosystems 430A” automatic synthesizer. The amino acids are coupled in the form of hydroxybenzotriazole (HOBT) esters. The amino acids used are obtained from Novabiochem (Läufelfingen, Switzerland) or Bachem (Bubendorf, Switzerland).

[0300] The chemical synthesis was performed using a double coupling protocol with N-methylpyrrolidone (NMP) as solvent. The peptides were cut from the resin, and the side-chain protective groups removed, simultaneously, using hydrofluoric acid (HF) in a suitable apparatus (type I cleavage apparatus, Peptide Institute, Osaka, Japan).

[0301] For 1 g of peptidyl resin, 10 ml of HF, 1 ml of anisole and 1 ml of dimethyl sulphide (DMS) are used. The mixture is stirred for 45 minutes at −2° C. The HF is then evaporated off under vacuum. After intensive washes with ether, the peptide is eluted from the resin with 10% acetic acid and then lyophilized.

[0302] The peptides are purified by preparative high performance liquid chromatography on a VYDAC C18 type column (250×21 mm) (The Separation Group, Hesperia, Calif., USA). Elution is carried out with an acetonitrile gradient at a flow rate of 22 ml/min. The fractions collected are monitored by an elution under isocratic conditions on a VYDAC® C18 analytical column (250×4.6 mm) at a flow rate of 1 ml/min. Fractions having the same retention time are pooled and lyophilized. The preponderant fraction is then analysed by analytical high performance liquid chromatography with the system described above. The peptide which is considered to be of acceptable purity manifests itself in a single peak representing not less than 95% of the chromatogram.

[0303] The purified peptides are then analyzed with the object of monitoring their amino acid composition, using an Applied Biosystems 420H automatic amino acid analyzer. Measurement of the (average) chemical molecular mass of the peptides is obtained using LSIMS mass spectrometry in the positive ion mode on a VG ZAB-ZSEQ double focusing instrument connected to a DEC-VAX 2000 acquisition system (VG Analytical Ltd, Manchester, England).

[0304] The reactivity of the different peptides was tested against sera of patients suffering from MS and against sera of healthy controls. This enabled a peptide designated POL2B to be selected, whose sequence is shown in FIG. 28 in the identifier SEQ ID NO: 35, below, encoded by the pol gene of MSRV-1 (nucleotides 181 to 330).

[0305] b) Antigenic properties:

[0306] The antigenic properties of the POL2B peptide were demonstrated according to the ELISA protocol described below.

[0307] The lyophilized POL2B peptide was dissolved in sterile distilled water at a concentration of 1 mg/ml. This stock solution was aliquoted and kept at +4° C. for use over a fortnight, or frozen at −20° C. for use within 2 months. An aliquot is diluted in PBS (phosphate buffered saline) solution so as to obtain a final peptide concentration of 1 microgram/ml. 100 microliters of this dilution are placed in each well of microtitration plates (“high-binding” plastic, COSTAR ref: 3590). The plates are covered with a “plate-sealer” type adhesive and kept overnight at +4° C. for the phase of adsorption of the peptide to the plastic. The adhesive is removed and the plates are washed three times with a volume of 300 microliters of a solution A (1×PBS, 0.05% Tween 20®), then inverted over an absorbent tissue. The plates thus drained are filled with 200 microliters per well of a solution B (solution A+10% of goat serum), then covered with an adhesive and incubated for 45 minutes to 1 hour at 37° C. The plates are then washed three times with the solution A as described above.

[0308] The test serum samples are diluted beforehand to 1/50 in the solution B, and 100 microliters of each dilute test serum are placed in the wells of each microtitration plate. A negative control is placed in one well of each plate, in the form of 100 microliters of buffer B. The plates covered with an adhesive are then incubated for 1 to 3 hours at 37° C. The plates are then washed three times with the solution A as described above. In parallel, a peroxidase-labelled goat antibody directed against human IgG (Sigma Immunochemicals ref. A6029) or IgM (Cappel ref. 55228) is diluted in the solution B (dilution 1/5000 for the anti-IgG and 1/1000 for the anti-IgM). 100 microliters of the appropriate dilution of the labelled antibody are then placed in each well of the microtitration plates, and the plates covered with an adhesive are incubated for 1 to 2 hours at 37° C. A further washing of the plates is then performed as described above. In parallel, the peroxidase substrate is prepared according to the directions of the “Sigma fast OPD kit” (Sigma Immunochemicals, ref. P9187). 100 microliters of substrate solution are placed in each well, and the plates are placed protected from light for 20 to 30 minutes at room temperature.

[0309] When the color reaction has stabilized, the plates are placed immediately in an ELISA plate spectrophotometric reader, and the optical density (OD) of each well is read at a wavelength of 492 nm. Alternatively, 30 microliters of 1N HCl are placed in each well to stop the reaction, and the plates are read in the spectrophotometer within 24 hours.

[0310] The serological samples are introduced in duplicate or in triplicate, and the optical density (OD) corresponding to the serum tested is calculated by taking the mean of the OD values obtained for the same sample at the same dilution.

[0311] The net OD of each serum corresponds to the mean OD of the serum minus the mean OD of the negative control (solution B: PBS, 0.05% Tween 20®, 10% goat serum).
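The net OD calculation described in the two preceding paragraphs can be sketched as follows. This is an illustrative helper only; the function name and the example readings are hypothetical and are not taken from the experiments reported here.

```python
from statistics import mean


def net_od(sample_replicates, negative_control_replicates):
    """Mean OD of the replicate wells of a serum minus the mean OD of the
    negative-control wells (solution B without serum)."""
    return mean(sample_replicates) - mean(negative_control_replicates)


if __name__ == "__main__":
    # Hypothetical OD readings at 492 nm, for illustration only.
    serum_wells = [0.70, 0.74]      # one serum tested in duplicate
    control_wells = [0.10, 0.12]    # negative-control wells
    print(f"net OD = {net_od(serum_wells, control_wells):.3f}")
```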

[0312] c) Detection of anti-MSRV-1 IgG antibodies by ELISA:

[0313] The technique described above was used with the POL2B peptide to test for the presence of anti-MSRV-1 specific IgG antibodies in the serum of 29 patients for whom a definite or probable diagnosis of MS was established according to the criteria of Poser (23), and of 32 healthy controls (blood donors).

[0314] FIG. 29 shows the results for each serum tested with an anti-IgG antibody. Each vertical bar represents the net optical density (OD at 492 nm) of a serum tested. The ordinate axis gives the net OD at the top of the vertical bars. The first 29 vertical bars lying to the left of the vertical broken line represent the sera of 29 cases of MS tested, and the 32 vertical bars lying to the right of the vertical broken line represent the sera of 32 healthy controls (blood donors).

[0315] The mean of the net OD values for the MS sera tested is 0.62. The diagram enables 5 controls to be revealed whose net OD rises above the grouped values of the control population. These values may represent the presence of specific IgGs in symptomless seropositive patients. Two methods were hence evaluated in order to determine the statistical threshold of positivity of the test.

[0316] The mean of the net OD values for the controls, including the controls with high net OD values, is 0.36. Without the 5 controls whose net OD values are greater than or equal to 0.5, the mean of the “negative” controls is 0.33. The standard deviation of the negative controls is 0.10. A theoretical threshold of positivity may be calculated according to the formula:

threshold value=(mean of the net OD values of the seronegative controls)+(2 or 3×standard deviation of the net OD values of the seronegative controls).

[0317] In the first case, there are considered to be symptomless seropositives, and the threshold value is equal to 0.33+(2×0.10)=0.53. The negative results represent a non-specific “background” of the presence of antibodies directed specifically against an epitope of the peptide.

[0318] In the second case, if the set of controls consisting of blood donors in apparent good health is taken as a reference basis, without excluding the sera which are, on the face of it, seropositive, the standard deviation of the “non-MS controls” is 0.116. The threshold value then becomes 0.36+(2×0.116)=0.59.
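The two threshold calculations above follow the same rule, namely the mean of the control net OD values plus a multiple of their standard deviation. A minimal sketch, with a hypothetical function name and using only the summary values quoted in the text, is:

```python
def positivity_threshold(control_mean, control_sd, k=2):
    """Threshold of positivity: control mean plus k standard deviations.

    k is 2 or 3 according to the formula given in the text.
    """
    return control_mean + k * control_sd


if __name__ == "__main__":
    # First method: the 5 high controls are excluded (mean 0.33, SD 0.10).
    first = positivity_threshold(0.33, 0.10, k=2)
    # Second method: all 32 controls are kept (mean 0.36, SD 0.116).
    second = positivity_threshold(0.36, 0.116, k=2)
    print(f"first method:  {first:.2f}")
    print(f"second method: {second:.2f}")
```

The second value is 0.592 before rounding, which the text reports as 0.59.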

[0319] According to this analysis, the test is specific for MS since, as shown in Table 1, no control has a net OD above this threshold. In fact, this result reflects the fact that the antibody titers in patients suffering from MS are, for the most part, higher than in healthy controls who have been in contact with MSRV-1.

TABLE No. 1
                   MS        CONTROLS
                   0.681     0.3515
                   1.0425    0.56
                   0.5675    0.3565
                   0.63      0.449
                   0.588     0.2825
                   0.645     0.55
                   0.6635    0.52
                   0.576     0.2535
                   0.7765    0.55
                   0.5745    0.51
                   0.513     0.426
                   0.4325    0.451
                   0.7255    0.227
                   0.859     0.3905
                   0.6435    0.265
                   0.5795    0.4295
                   0.8655    0.291
                   0.671     0.347
                   0.596     0.4495
                   0.662     0.3725
                   0.602     0.181
                   0.525     0.2725
                   0.53      0.426
                   0.565     0.1915
                   0.517     0.222
                   0.607     0.395
                   0.3705    0.34
                   0.397     0.307
                   0.4395    0.219
                             0.491
                             0.2265
                             0.2605
MEAN               0.62      0.33
STD DEV            0.14      0.10
THRESHOLD VALUE    0.53

[0320] In accordance with the first method of calculation, and as shown in FIG. 29 and in the corresponding Table 1, 26 of the 29 MS sera give a positive result (net OD greater than or equal to 0.50), indicating the presence of IgGs specifically directed against the POL2B peptide, hence against a portion of the reverse transcriptase enzyme of the MSRV-1 retrovirus encoded by its pol gene, and consequently against the MSRV-1 retrovirus. Thus, approximately 90% of the MS patients tested have reacted against an epitope carried by the POL2B peptide and possess circulating IgGs directed against the latter.

[0321] Five out of 32 blood donors in apparent good health show a positive result. Thus, it is apparent that approximately 15% of the symptomless population may have been in contact with an epitope carried by the POL2B peptide under conditions which have led to an active immunization which manifests itself in the persistence of specific serum IgGs. These conditions are compatible with an immunization against the MSRV-1 retrovirus reverse transcriptase during an infection with (and/or reactivation of) the MSRV-1 retrovirus. The absence of apparent neurological pathology recalling MS in these seropositive controls may indicate that they are healthy carriers and have eliminated an infectious virus after immunizing themselves, or that they constitute an at-risk population of chronic carriers. In effect, epidemiological data showing that a pathogenic agent present in the environment of regions of high prevalence of MS may be the cause of this disease imply that a fraction of the population free from MS has necessarily been in contact with such a pathogenic agent. It has been shown that the MSRV-1 retrovirus constitutes all or part of this “pathogenic agent” at the source of MS, and it is hence normal for controls taken from a healthy population to possess IgG type antibodies against components of the MSRV-1 retrovirus. Thus, the difference in seroprevalence between the MS and control populations is extremely significant: “chi-squared” test, p<0.001. These results hence point to an aetiopathogenic role of MSRV-1 in MS.
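The chi-squared comparison mentioned above can be illustrated with the counts quoted in the text (26 positive sera out of 29 MS cases, 5 positive out of 32 controls). The sketch below assumes a plain Pearson chi-squared statistic on a 2×2 table with one degree of freedom and no continuity correction; the text does not state which variant was used, and the function names are hypothetical.

```python
import math


def chi_squared_2x2(a, b, c, d):
    """Pearson chi-squared statistic for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


def p_value_one_df(chi2):
    """Upper-tail probability of a chi-squared variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))


if __name__ == "__main__":
    # Rows: MS sera, control sera; columns: positive, negative (counts from the text).
    chi2 = chi_squared_2x2(26, 3, 5, 27)
    p = p_value_one_df(chi2)
    print(f"chi-squared = {chi2:.2f}, p = {p:.2e}")
    print("p < 0.001:", p < 0.001)
```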

[0322] d) Detection of anti-MSRV-1 IgM antibodies by ELISA:

[0323] The ELISA technique with the POL2B peptide was used to test for the presence of anti-MSRV-1 IgM specific antibodies in the serum of 36 patients for whom a definite or probable diagnosis of MS was established according to the criteria of Poser (23), and of 42 healthy controls (blood donors).

[0324] FIG. 30 shows the results for each serum tested with an anti-IgM antibody. Each vertical bar represents the net optical density (OD at 492 nm) of a serum tested. The ordinate axis gives the net OD at the top of the vertical bars. The first 36 vertical bars lying to the left of the vertical line cutting the abscissa axis represent the sera of 36 cases of MS tested, and the vertical bars lying to the right of this vertical line represent the sera of 42 healthy controls (blood donors). The horizontal line drawn in the middle of the diagram represents a theoretical threshold defining the boundary between the positive results (in which the top of the bar lies above) and the negative results (in which the top of the bar lies below).

[0325] The mean of the net OD values for the MS cases tested is 0.19.

[0326] The mean of the net OD values for the controls is 0.09.

[0327] The standard deviation of the negative controls is 0.05.

[0328] In view of the small difference between the mean and the standard deviation of the controls, the threshold of theoretical positivity may be calculated according to the formula:

threshold value=(mean of the net OD values of the seronegative controls)+(3×standard deviation of the net OD values of the seronegative controls).

[0329] The threshold value is hence equal to 0.09+(3×0.05)=0.26; or, in practice, 0.25.

[0330] The negative results represent a non-specific “background” of the presence of antibodies directed specifically against an epitope of the peptide.

[0331] According to this analysis, and as shown in FIG. 30 and in the corresponding Table 2, the IgM test is specific for MS, since no control has a net OD above the threshold. Seven of the 36 MS sera produce a positive IgM result; now, a study of the clinical data reveals that these positive sera were taken during a first attack of MS or an acute attack in untreated patients. It is known that IgMs directed against pathogenic agents are produced during primary infections or during reactivations following a latency phase of the said pathogenic agent.

[0332] The difference in seroprevalence between the MS and controlpopulations is extremely significant: “chi-squared” test, p<0.001.

[0333] These results point to an aetiopathogenic role of MSRV-1 in MS.

[0334] The detection of IgM and IgG antibodies against the POL2B peptide enables the course of an MSRV-1 infection and/or of the viral reactivation of MSRV-1 to be evaluated.

TABLE No. 2
                   MS        CONTROLS
                   0.064     0.243
                   0.087     0.11
                   0.044     0.098
                   0.115     0.028
                   0.089     0.094
                   0.025     0.038
                   0.097     0.176
                   0.108     0.146
                   0.018     0.049
                   0.234     0.161
                   0.274     0.113
                   0.225     0.079
                   0.314     0.093
                   0.522     0.127
                   0.306     0.02
                   0.143     0.052
                   0.375     0.062
                   0.142     0.074
                   0.157     0.043
                   0.168     0.046
                   1.051     0.041
                   0.104     0.13
                   0.187     0.153
                   0.044     0.107
                   0.053     0.178
                   0.153     0.114
                   0.07      0.078
                   0.033     0.118
                   0.104     0.177
                   0.187     0.026
                   0.044     0.024
                   0.053     0.046
                   0.153     0.116
                   0.07      0.04
                   0.033     0.028
                   0.973     0.073
                             0.008
                             0.074
                             0.141
                             0.219
                             0.047
                             0.017
MEAN               0.19      0.09
STD. DEV.          0.23      0.05
THRESHOLD VALUE    0.26
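The IgM threshold and the resulting counts can be recomputed from the individual net OD values of Table 2. The sketch below is illustrative only: the function name is hypothetical, the sample standard deviation is an assumption (the text does not say which estimator was used), and the values are entered as printed, so figures recomputed from these rounded entries may differ slightly from the summary values in the table. With the rounded summary values, 0.09+(3×0.05) gives 0.24, whereas the unrounded statistics give a threshold close to the 0.26 stated in the text.

```python
from statistics import mean, stdev

# Net OD values (IgM) as printed in Table 2.
MS_IGM = [
    0.064, 0.087, 0.044, 0.115, 0.089, 0.025, 0.097, 0.108, 0.018,
    0.234, 0.274, 0.225, 0.314, 0.522, 0.306, 0.143, 0.375, 0.142,
    0.157, 0.168, 1.051, 0.104, 0.187, 0.044, 0.053, 0.153, 0.07,
    0.033, 0.104, 0.187, 0.044, 0.053, 0.153, 0.07, 0.033, 0.973,
]
CONTROL_IGM = [
    0.243, 0.11, 0.098, 0.028, 0.094, 0.038, 0.176, 0.146, 0.049,
    0.161, 0.113, 0.079, 0.093, 0.127, 0.02, 0.052, 0.062, 0.074,
    0.043, 0.046, 0.041, 0.13, 0.153, 0.107, 0.178, 0.114, 0.078,
    0.118, 0.177, 0.026, 0.024, 0.046, 0.116, 0.04, 0.028, 0.073,
    0.008, 0.074, 0.141, 0.219, 0.047, 0.017,
]


def summarize(ms_values, control_values, k=3):
    """Threshold = control mean + k * control SD, plus counts above it."""
    control_mean = mean(control_values)
    control_sd = stdev(control_values)  # sample SD: an assumption
    threshold = control_mean + k * control_sd
    return {
        "control_mean": control_mean,
        "control_sd": control_sd,
        "threshold": threshold,
        "ms_positive": sum(od > threshold for od in ms_values),
        "control_positive": sum(od > threshold for od in control_values),
    }


if __name__ == "__main__":
    for key, value in summarize(MS_IGM, CONTROL_IGM).items():
        print(f"{key}: {value:.3f}" if isinstance(value, float) else f"{key}: {value}")
```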

[0335] e) Search for immunodominant epitopes in the POL2B peptide:

[0336] In order to reduce the non-specific background and to optimize the detection of the responses of the anti-MSRV-1 antibodies, the synthesis of octapeptides, advancing in successive one amino acid steps, covering the whole of the sequence determined by POL2B, was carried out according to the protocol described below.

[0337] The chemical synthesis of overlapping octapeptides covering the amino acid sequence 61-110 shown in the identifier SEQ ID NO: 35 was carried out on an activated cellulose membrane according to the technique of BERG et al. (1989. J. Am. Chem. Soc., 111, 8024-8026) marketed by Cambridge Research Biochemicals under the trade name Spotscan. This technique permits the simultaneous synthesis of a large number of peptides and their analysis.

[0338] The synthesis is carried out with esterified amino acids in which the α-amino group is protected with an FMOC group (Nova Biochem) and the side-chain groups with protective groups such as trityl, t-butyl ester or t-butyl ether. The esterified amino acids are solubilized in N-methylpyrrolidone (NMP) at a concentration of 300 nM, and 0.9 ml are applied to spots of deposit of bromophenol blue. After incubation for 15 minutes, a further application of amino acids is carried out, followed by another 15-minute incubation. If the coupling between two amino acids has taken place correctly, a coloration modification (change from blue to yellow-green) is observed. After three washes in DMF, an acetylation step is performed with acetic anhydride. Next, the terminal amino groups of the peptides in the process of synthesis are deprotected with 20% piperidine in DMF. The spots of deposit are restained with a 1% solution of bromophenol blue in DMF, washed three times with methanol and dried. This set of operations constitutes one cycle of addition of an amino acid, and this cycle is repeated until the synthesis is complete. When all the amino acids have been added, the NH₂-terminal group of the last amino acid is deprotected with 20% piperidine in DMF and acetylated with acetic anhydride. The groups protecting the side chain are removed with a dichloromethane/trifluoroacetic acid/triisobutylsilane (5 ml/5 ml/250 ml) mixture. The immunoreactivity of the peptides is then tested by ELISA.

[0339] After synthesis of the different octapeptides in duplicate on two different membranes, the latter are rinsed with methanol and washed in TBS (0.1M Tris pH 7.2), then incubated overnight at room temperature in a saturation buffer. After several washes in TBS-T (0.1M Tris pH 7.2-0.05% Tween 20), one membrane is incubated with a 1/50 dilution of a reference serum originating from a patient suffering from MS, and the other membrane with a 1/50 dilution of a pool of sera of healthy controls. The membranes are incubated for 4 hours at room temperature. After washes with TBS-T, a β-galactosidase-labelled anti-human immunoglobulin conjugate (marketed by Cambridge Research Biochemicals) is added at a dilution of 1/200, and the mixture is incubated for two hours at room temperature. After washes of the membranes with 0.05% TBS-T and PBS, the immunoreactivity in the different spots is visualized by adding 5-bromo-4-chloro-3-indolyl β-D-galactopyranoside in potassium. The intensity of coloration of the spots is estimated qualitatively with a relative value from 0 to 5 as shown in the attached FIGS. 31 to 33.

[0340] In this way, it is possible to determine two immunodominant regions at each end of the POL2B peptide, corresponding, respectively, to the amino acid sequences 65-75 (SEQ ID NO: 37) and 92-109 (SEQ ID NO: 38), according to FIG. 34, and lying, respectively, between the octapeptides Phe-Cys-Ile-Pro-Val-Arg-Pro-Asp (FCIPVRPD) (SEQ ID NO: 146) and Arg-Pro-Asp-Ser-Gln-Phe-Leu-Phe (RPDSQFLF) (SEQ ID NO: 147), and Thr-Val-Leu-Pro-Gln-Gly-Phe-Arg (TVLPQGFR) (SEQ ID NO: 148) and Leu-Phe-Gly-Gln-Ala-Leu-Ala-Gln (LFGQALAQ) (SEQ ID NO: 149), and a region which is less reactive but apparently more specific, since it does not produce any background with the control serum, represented by the octapeptides Leu-Phe-Ala-Phe-Glu-Asp-Pro-Leu (LFAFEDPL) (SEQ ID NO: 39) and Phe-Ala-Phe-Glu-Asp-Pro-Leu-Asn (FAFEDPLN) (SEQ ID NO: 40).
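The principle of overlapping octapeptides advancing in one amino acid steps can be sketched as a sliding window over a sequence. The example is illustrative only: the function name is hypothetical, and the 13-residue fragment is an assumption obtained by merging the two overlapping octapeptides FCIPVRPD and RPDSQFLF quoted above on their shared residues; it is not taken from the sequence listing.

```python
def overlapping_peptides(sequence, length=8, step=1):
    """Return every window of `length` residues, advancing by `step` residues."""
    last_start = len(sequence) - length
    return [sequence[i:i + length] for i in range(0, last_start + 1, step)]


if __name__ == "__main__":
    # Fragment assumed for illustration: FCIPVRPD merged with RPDSQFLF.
    fragment = "FCIPVRPDSQFLF"
    for peptide in overlapping_peptides(fragment):
        print(peptide)
    # A 50-residue stretch (such as residues 61-110) yields 50 - 8 + 1 windows.
    print(len(overlapping_peptides("A" * 50)))
```

For this fragment the windows run from FCIPVRPD to RPDSQFLF, and a 50-residue stretch gives 43 octapeptides.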

[0341] These regions make it possible to define new peptides which are more specific and more immunoreactive according to the usual techniques.

[0342] It is thus possible, as a result of the discoveries made and the methods developed by the inventors, to carry out a diagnosis of MSRV-1 infection and/or reactivation and to evaluate a therapy in MS on the basis of its efficacy in “negativing” the detection of these agents in the patients' biological fluids. Furthermore, early detection in individuals not yet displaying neurological signs of MS could make it possible to institute a treatment which would be all the more effective with respect to the subsequent clinical course because it would precede the lesion stage which corresponds to the onset of neurological disorders. At the present time, however, a diagnosis of MS cannot be established before a symptomatology of neurological lesions has set in, and hence no treatment is instituted before the emergence of a clinical picture suggestive of lesions of the central nervous system which are already significant. The diagnosis of an MSRV-1 and/or MSRV-2 infection and/or reactivation in man is hence of decisive importance, and the present invention provides the means of doing this.

[0343] It is thus possible, apart from carrying out a diagnosis of MSRV-1 infection and/or reactivation, to evaluate a therapy in MS on the basis of its efficacy in “negativing” the detection of these agents in the patients' biological fluids.

EXAMPLE 12 Obtaining a Clone LB19 Containing a Portion of the gag Gene of the MSRV-1 Retrovirus

[0344] A PCR technique derived from the technique published by Gonzalez-Quintial R et al. (19) and PLAZA et al. (25) was used. From the total RNAs extracted from a fraction of virion purified as described above, the cDNA was synthesized using a specific primer (SEQ ID No. 60) at the 3′ end of the genome to be amplified, using EXPAND™ REVERSE TRANSCRIPTASE (BOEHRINGER MANNHEIM).

[0345] cDNA:

[0346] AAGGGGCATG GACGAGGTGG TGGCTTATTT (SEQ ID NO: 61) (antisense)

[0347] After purification, a poly(G) tail was added at the 5′ end of the cDNA using the “Terminal transferases kit” marketed by the company Boehringer Mannheim, according to the manufacturer's protocol.

[0348] An anchoring PCR was carried out using the following 5′ and 3′ primers:

[0349] AGATCTGCAGAATTCGATATCACCCCCCCCCCCCCC (SEQ ID No. 85) (sense),

[0350] and AAATGTCTGC GGCACCAATC TCCATGTT (SEQ ID No. 60) (antisense)

[0351] Next, a semi-nested anchoring PCR was carried out with the following 5′ and 3′ primers:

[0352] AGATCTGCAG AATTCGATAT CA (SEQ ID No.86) (sense), and

[0353] AAATGTCTGC GGCACCAATC TCCATGTT (SEQ ID No.60) (antisense)
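The relationship between the two sense primers can be checked in a few lines. This is a sketch only: it verifies that the semi-nested primer (SEQ ID No. 86) is the 5′ adapter portion of the anchor primer (SEQ ID No. 85) and that the remainder of the anchor primer is a poly(C) stretch, which is the part able to anneal to the poly(G) tail added to the cDNA. The helper name `split_anchor` is hypothetical.

```python
ANCHOR_PRIMER = "AGATCTGCAGAATTCGATATCACCCCCCCCCCCCCC"  # SEQ ID No. 85 (sense)
NESTED_PRIMER = "AGATCTGCAGAATTCGATATCA"                # SEQ ID No. 86 (sense)


def split_anchor(anchor, adapter):
    """Split an anchor primer into its 5' adapter and its homopolymer tail."""
    if not anchor.startswith(adapter):
        raise ValueError("adapter is not a 5' prefix of the anchor primer")
    return adapter, anchor[len(adapter):]


adapter, tail = split_anchor(ANCHOR_PRIMER, NESTED_PRIMER)
tail_is_poly_c = len(tail) > 0 and set(tail) == {"C"}
print(len(adapter), len(tail), tail_is_poly_c)
```

Because the second-round sense primer lacks the homopolymer stretch, it can only prime on products of the first round that already carry the adapter.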

[0354] The products originating from the PCR were purified on agarose gel according to conventional methods (17), and then resuspended in 10 microliters of distilled water. Since one of the properties of Taq polymerase consists in adding an adenine at the 3′ end of each of the two DNA strands, the DNA obtained was inserted directly into a plasmid using the TA Cloning™ kit (British Biotechnology). The 2 μl of DNA solution were mixed with 5 μl of sterile distilled water, 1 μl of 10-fold concentrated ligation buffer “10×LIGATION BUFFER”, 2 μl of “pCR™ VECTOR” (25 ng/μl) and 1 μl of “T4 DNA LIGASE”. This mixture was incubated overnight at 12° C. The following steps were carried out according to the instructions of the TA Cloning™ kit (British Biotechnology). At the end of the procedure, the white colonies of recombinant bacteria were picked out in order to be cultured and to permit extraction of the plasmids incorporated according to the so-called “miniprep” procedure (17). The plasmid preparation from each recombinant colony was cut with a suitable restriction enzyme and analysed on agarose gel. Plasmids possessing an insert detected under UV light after staining the gel with ethidium bromide were selected for sequencing of the insert, after hybridization with a primer complementary to the Sp6 promoter present on the cloning plasmid of the TA Cloning Kit™. The reaction prior to sequencing was then performed according to the method recommended for the use of the sequencing kit “Prism ready reaction kit dye deoxyterminator cycle sequencing kit” (Applied Biosystems, ref. 401384), and automatic sequencing was carried out with an Applied Biosystems “Automatic Sequencer, model 373 A” apparatus according to the manufacturer's instructions.

[0355] PCR amplification according to the technique mentioned above was used on a cDNA synthesized from the nucleic acids of fractions of infective particles purified on a sucrose gradient, according to the technique described by H. Perron (13), from culture supernatants of B lymphocytes of a patient suffering from MS, immortalized with Epstein-Barr virus (EBV) strain B95 and expressing retroviral particles associated with reverse transcriptase activity as described by Perron et al. (3) and in French Patent Applications MS 10, 11 and 12. This yielded the clone LB19, whose sequence, identified by SEQ ID NO: 55, is presented in FIG. 35.

[0356] The clone makes it possible to define, with the clone GM3 previously sequenced and the clone G+E+A (see Example 15), a region of 690 base pairs representative of a significant portion of the gag gene of the MSRV-1 retrovirus, as presented in FIG. 36. This sequence designated SEQ ID NO: 82 is reconstituted from different clones overlapping at their ends. This sequence is identified under the name MSRV-1 “gag*” region. In FIG. 36, a potential reading frame with the translation into amino acids is presented below the nucleic acid sequence.

EXAMPLE 13 Obtaining a Clone FBd13 Containing a pol Gene Region Related to the MSRV-1 Retrovirus and an Apparently Incomplete ENV Region Containing a Potential Reading Frame (ORF) for a Glycoprotein

[0357] Extraction of viral RNAs: The RNAs were extracted according to the method briefly described below.

[0358] A pool of culture supernatant of B lymphocytes of patients suffering from MS (650 ml) is centrifuged for 30 minutes at 10,000 g. The viral pellet obtained is resuspended in 300 microliters of PBS/10 mM MgCl₂. The material is treated with a DNAse (100 mg/ml)/RNAse (50 mg/ml) mixture for 30 minutes at 37° C. and then with proteinase K (50 mg/ml) for 30 minutes at 46° C.

[0359] The nucleic acids are extracted with one volume of a phenol/0.1% SDS (V/V) mixture heated to 60° C., and then re-extracted with one volume of phenol/chloroform (1:1; V/V).

[0360] Precipitation of the material is performed with 2.5 V of ethanol in the presence of 0.1 V of sodium acetate pH 5.2. The pellet obtained after centrifugation is resuspended in 50 microliters of sterile DEPC water.

[0361] The sample is treated again with 50 mg/ml of “RNAse free” DNAse for 30 minutes at room temperature, extracted with one volume of phenol/chloroform and precipitated in the presence of sodium acetate and ethanol.

[0362] The RNA obtained is quantified by an OD reading at 260 nm. The presence of MSRV-1 and the absence of DNA contaminant is monitored by a PCR and an MSRV-1-specific RT-PCR associated with a specific ELOSA for the MSRV-1 genome.

[0363] Synthesis of cDNA:

[0364] 5 μg of RNA are used to synthesize a cDNA primed with a poly(dT) oligonucleotide according to the instructions of the “cDNA Synthesis Module” kit (ref RPN 1256, Amersham) with a few modifications: the reverse transcription is performed at 45° C. instead of the recommended 42° C.

[0365] The synthesis product is purified by a double extraction and a double purification according to the manufacturer's instructions.

[0366] The presence of MSRV-1 is verified by an MSRV-1 PCR associated with a specific ELOSA for the MSRV-1 genome.

[0367] “Long Distance PCR”: (LD-PCR)

[0368] 500 ng of cDNA are used for the LD-PCR step (Expand Long Template System; Boehringer (ref. 1681 842)).

[0369] Several pairs of oligonucleotides were used. Among these, the pair defined by the following primers:
5′ primer: GGAGAAGAGC AGCATAAGTG G (SEQ ID NO: 62)
3′ primer: GTGCTGATTG GTGTATTTAC AATCC (SEQ ID NO: 63)

[0370] The amplification conditions are as follows:

[0371] 94° C. 10 seconds

[0372] 56° C. 30 seconds

[0373] 68° C. 5 minutes;

[0374] 10 cycles, then 20 cycles with an increment of 20 seconds in each cycle on the elongation time. At the end of this first amplification, 2 microliters of the amplification product are subjected to a second amplification under the same conditions as before.
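The cycling programme above can be written out explicitly to see how the elongation step grows. The following is a sketch under one stated assumption: the 20-second increment is taken to start with the first of the 20 additional cycles, so that cycle 11 elongates for 5 min 20 s and cycle 30 for 11 min 40 s. Ramp times between temperatures are ignored, and the function name is illustrative.

```python
def ld_pcr_schedule(base_cycles=10, extended_cycles=20,
                    denature_s=10, anneal_s=30,
                    elongation_s=300, increment_s=20):
    """Return one (denaturation, annealing, elongation) tuple in seconds per cycle."""
    schedule = [(denature_s, anneal_s, elongation_s)
                for _ in range(base_cycles)]
    # Assumption: the increment applies from the first extended cycle onwards.
    for k in range(1, extended_cycles + 1):
        schedule.append((denature_s, anneal_s, elongation_s + k * increment_s))
    return schedule


schedule = ld_pcr_schedule()
total_hold_s = sum(sum(cycle) for cycle in schedule)
print(len(schedule), schedule[10][2], schedule[-1][2], total_hold_s / 3600)
```

Under this reading, the programmed hold times of one amplification round add up to four hours, not counting ramping; a different interpretation of where the increment starts would shift that figure slightly.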

[0375] The LD-PCR reactions are conducted in a Perkin model 9600 PCR apparatus in thin-walled microtubes (Boehringer).

[0376] The amplification products are monitored by electrophoresis of ⅕th of the amplification volume (10 microliters) in 1% agarose gel. For the pair of primers described above, a band of approximately 1.7 kb is obtained.

[0377] Cloning of the amplified fragment:

[0378] The PCR product was purified by passage through a preparative agarose gel and then through a Costar column (Spin; D. Dutcher) according to the supplier's instructions.

[0379] 2 microliters of the purified solution are ligated with 50 ng of vector PCRII according to the supplier's instructions (TA Cloning Kit; British Biotechnology).

[0380] The recombinant vector obtained is isolated by transformation of competent DH5αF′ bacteria. The bacteria are selected using their resistance to ampicillin and the loss of metabolism for Xgal (=white colonies). The molecular structure of the recombinant vector is confirmed by plasmid minipreparation and hydrolysis with the enzyme EcoRI.

[0381] FBd13, a positive clone for all these criteria, was selected. A large-scale preparation of the recombinant plasmid was performed using the Qiagen Midiprep kit (ref 12243) according to the supplier's instructions.

[0382] Sequencing of the clone FBd13 is performed by means of the Perkin Prism Ready Amplitaq FS dye terminator kit (ref. 402119) according to the manufacturer's instructions. The sequence reactions are introduced into a Perkin type 377 or 373A automatic sequencer. The sequencing strategy consists in gene walking carried out on both strands of the clone FBd13.

[0383] The sequence of the clone FBd13 is identified by SEQ ID NO: 54.

[0384] In FIG. 37, the sequence homology between the clone FBd13 and the HSERV-9 retrovirus is shown on the matrix chart by a continuous line for any partial homology greater than or equal to 70%. It can be seen that there are homologies in the flanking regions of the clone (with the pol gene at the 5′ end and with the env gene and then the LTR at the 3′ end), but that the internal region is totally divergent and does not display any homology, even weak, with the env gene of HSERV-9. Furthermore, it is apparent that the clone FBd13 contains a longer “env” region than the one which is described for the defective endogenous HSERV-9; it may thus be seen that the internal divergent region constitutes an “insert” between the regions of partial homology with the HSERV-9 defective genes.

[0385] This additional sequence determines a potential ORF, designated ORF B13, which is represented by its amino acid sequence SEQ ID NO: 81.

[0386] The molecular structure of the clone FBd13 was analyzed using the GenWork® software and GenBank™ and SwissProt data banks.

[0387] 5 glycosylation sites were found.

[0388] The protein does not have significant homology with already known sequences.

[0389] It is probable that this clone originates from a recombination of an endogenous retroviral element (ERV), linked to the replication of MSRV-1.

[0390] Such a phenomenon does not fail to generate the expression of polypeptides, or even of endogenous retroviral proteins, which are not necessarily tolerated by the immune system. Such a scheme of aberrant expression of endogenous elements related to MSRV-1 and/or induced by the latter is liable to multiply the aberrant antigens, and hence tends to contribute to the induction of autoimmune processes such as are observed in MS. It clearly constitutes a novel element never hitherto described. In effect, interrogation of the data banks of nucleic acid sequences available in version No. 19 (1996) of the “Entrez” software (NCBI, NIH, Bethesda, USA) did not enable a known homologous sequence comprising the whole of the env region of this clone to be identified.

EXAMPLE 14 Obtaining a Clone FP6 Containing a Portion of the pol Gene, with a Region Coding for the Reverse Transcriptase Enzyme Homologous to the Clone POL* MSRV-1, and a 3′pol Region Divergent from the Equivalent Sequences Described in the Clones POL*, tpol, FBd3, JLBc1 and JLBc2

[0391] A 3′RACE was performed on total RNA extracted from plasma of a patient suffering from MS. A healthy control plasma treated under the same conditions was used as negative control. The synthesis of cDNA was carried out with the following modified oligo(dT) primer:

[0392] 5′ GACTCGCTGC AGATCGATTT TTTTTTTTTT TTTT 3′ (SEQ ID NO: 64)

[0393] and Boehringer “Expand RT” reverse transcriptase according to the conditions recommended by the company. A PCR was performed with the enzyme Klentaq (Clontech) under the following conditions: 94° C. 5 min then 93° C. 1 min, 58° C. 1 min, 68° C. 3 min for 40 cycles and 68° C. for 8 min, and with a final reaction volume of 50 μl.

[0394] Primers used for the PCR:

[0395] 5′ primer, identified by SEQ ID NO: 65: 5′ GCCATCAAGC CACCCAAGAA CTCTTAACTT 3′;

[0396] 3′ primer, identified by SEQ ID NO: 64 (=the same as for the cDNA)

[0397] A second, so-called “semi-nested” PCR was carried out with a 5′ primer located within the region already amplified. This second PCR was performed under the same experimental conditions as those used in the first PCR, using 10 μl of the amplification product originating from the first PCR.

[0398] Primers used for the semi-nested PCR:

[0399] 5′ primer, identified by SEQ ID NO: 66: 5′ CCAATAGCCA GACCATTATA TACACTAATT 3′;

[0400] 3′ primer, identified by SEQ ID NO: 64 (=the same as for the cDNA)

[0401] Primers SEQ ID NO: 65 and SEQ ID NO: 66 are specific for the pol* region: position No. 403 to No. 422 and No. 641 to No. 670, respectively.

[0402] An amplification product was thus obtained from the extracellular RNA extracted from the plasma of a patient suffering from MS. The corresponding fragment was not observed for the plasma of the healthy control. This amplification product was cloned in the following manner.

[0403] The amplified DNA was inserted into a plasmid using the TA Cloning™ kit. The 2 μl of DNA solution were mixed with 5 μl of sterile distilled water, 1 μl of a 10-fold concentrated ligation buffer “10×LIGATION BUFFER”, 2 μl of “pCR™ VECTOR” (25 ng/μl) and 1 μl of “T4 DNA LIGASE”. This mixture was incubated overnight at 12° C. The following steps were carried out according to the instructions of the TA Cloning™ kit (British Biotechnology). At the end of the procedure, the white colonies of recombinant bacteria were picked out in order to be cultured and to permit extraction of the plasmids incorporated according to the so-called “miniprep” procedure (17). The plasmid preparation from each recombinant colony was cut with a suitable restriction enzyme and analyzed on agarose gel. Plasmids possessing an insert detected under UV light after staining the gel with ethidium bromide were selected for sequencing of the insert, after hybridization with a primer complementary to the Sp6 promoter present on the cloning plasmid of the TA cloning kit™. The reaction prior to sequencing was then performed according to the method recommended for the use of the sequencing kit “Prism ready reaction kit dye deoxyterminator cycle sequencing kit” (Applied Biosystems, ref. 401384), and automatic sequencing was carried out with an Applied Biosystems “Automatic Sequencer, model 373 A” apparatus according to the manufacturer's instructions.

[0404] The clone obtained, designated FP6, enables a region of 467 bp which is 89% homologous to the pol* region of the MSRV-1 retrovirus and a region of 1167 bp which is 64% homologous to the pol region of ERV-9 (No. 1634 to 2856) to be defined.

[0405] The clone FP6 is represented in FIG. 38 by its nucleotide sequence identified by SEQ ID NO: 57. The three potential reading frames of this clone are indicated by their amino acid sequence under the nucleotide sequence.

EXAMPLE 15 Obtaining a Region Designated G+E+A Containing an ORF for a Retroviral Protease, by PCR Amplification of the Nucleic Acid Sequence Contained Between the 5′ Region Defined by the Clone “GM3” and the 3′ Region Defined by the Clone POL*, from the RNA Extracted from a Pool of Plasmas of Patients Suffering from MS

[0406] Oligonucleotides specific for the MSRV-1 sequences already identified by the Applicant were defined in order to amplify the retroviral RNA originating from virions present in the plasma of patients suffering from MS. Control reactions were performed so as to monitor the presence of contaminants (reaction with water). The amplification consists of a step of RT-PCR followed by a “nested” PCR. Pairs of primers were defined for amplifying three overlapping regions (designated G, E and A) on the regions defined by the sequences of the clones GM3 and pol* described above.

[0407] Semi-nested RT-PCR for amplification of the region G:

[0408] in the first RT-PCR cycle, the following primers are used:

[0409] primer 1: SEQ ID NO: 67 (sense)

[0410] primer 2: SEQ ID NO: 68 (antisense)

[0411] in the second PCR cycle, the following primers are used:

[0412] primer 1: SEQ ID NO: 69 (sense)

[0413] primer 4: SEQ ID NO: 70 (antisense)

[0414] Nested RT-PCR for amplification of the region E:

[0415] in the first RT-PCR cycle, the following primers are used:

[0416] primer 5: SEQ ID NO: 71 (sense)

[0417] primer 6: SEQ ID NO: 72 (antisense)

[0418] in the second PCR cycle, the following primers are used:

[0419] primer 7: SEQ ID NO: 73 (sense)

[0420] primer 8: SEQ ID NO: 72 (antisense)

[0421] Semi-nested RT-PCR for amplification of the region A:

[0422] in the first RT-PCR cycle, the following primers are used:

[0423] primer 9: SEQ ID NO: 74 (sense)

[0424] primer 10: SEQ ID NO: 75 (antisense)

[0425] in the second PCR cycle, the following primers are used:

[0426] primer 9: SEQ ID NO: 74 (sense)

[0427] primer 11: SEQ ID NO: 76 (antisense)

[0428] The primers and the regions G, E and A which they define are positioned as follows:

[0429] The sequence of the region defined by the different clones G, E and A was determined after cloning and sequencing of the “nested” amplification products.

[0430] The clones G, E and A were assembled together by PCR with the primers 1 at the 5′ end of the fragment G and 11 at the 3′ end of the fragment A, the primers being described above. An approximately 1580-bp fragment G+E+A was amplified and inserted into a plasmid using the TA Cloning (trademark) kit. The sequence of the amplification product corresponding to G+E+A was determined and analysis of the G+E and E+A overlaps was carried out. The sequence is shown in FIG. 39, and corresponds to the sequence SEQ ID NO: 83.
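The analysis of the G+E and E+A overlaps can be pictured as joining fragments on their shared ends. The sketch below is an in-silico illustration only, not the laboratory assembly by PCR described above: it merges two sequences on their longest exact suffix/prefix overlap. The function name, the minimum overlap of four bases and the three toy fragments are all invented for the example and are not MSRV-1 sequences.

```python
def merge_overlap(left, right, min_overlap=4):
    """Join two sequences on their longest exact suffix/prefix overlap."""
    max_len = min(len(left), len(right))
    for size in range(max_len, min_overlap - 1, -1):
        if left[-size:] == right[:size]:
            return left + right[size:]
    raise ValueError("no overlap of at least %d bases" % min_overlap)


# Toy fragments invented for illustration; they are NOT the MSRV-1 sequences.
g = "ATGGCCTTAGC"
e = "TTAGCGGATCCA"
a = "ATCCATTGACG"

contig = merge_overlap(merge_overlap(g, e), a)
print(contig)
```

Real sequencing reads rarely overlap perfectly, so a practical assembly would tolerate mismatches; exact matching is used here only to keep the idea visible.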

[0431] A reading frame coding for an MSRV-1 retroviral protease was found in the region E. The amino acid sequence of the protease, identified by SEQ ID NO: 84, is presented in FIG. 40.

EXAMPLE 16 Obtaining a Clone LTRGAG12, Related to an Endogenous Retroviral Element (ERV) Close to MSRV-1, in the DNA of an MS Lymphoblastoid Line Producing Virions and Expressing the MSRV-1 Retrovirus

[0432] A nested PCR was performed on the DNA extracted from a lymphoblastoid line (B lymphocytes immortalized with the EBV virus strain B95, as described above and as is well known to a person skilled in the art) expressing the MSRV-1 retrovirus and originating from peripheral blood lymphocytes of a patient suffering from MS.

[0433] In the first PCR step, the following primers are used:
primer 4327: CTCGATTTCT TGCTGGGCCT TA (SEQ ID NO: 77)
primer 3512: GTTGATTCCC TCCTCAAGCA (SEQ ID NO: 78)

[0434] This step comprises 35 amplification cycles with the following conditions: 1 min at 94° C., 1 min at 54° C. and 4 min at 72° C.

[0435] In the second PCR step, the following primers are used:
primer 4294: CTCTACCAAT CAGCATGTGG (SEQ ID NO: 79)
primer 3591: TGTTCCTCTT GGTCCCTAT (SEQ ID NO: 80)

[0436] This step comprises 35 amplification cycles with the following conditions: 1 min at 94° C., 1 min at 54° C. and 4 min at 72° C.

[0437] The products originating from the PCR were purified on agarose gel according to conventional methods (17), and then resuspended in 10 μl of distilled water. Since one of the properties of Taq polymerase consists in adding an adenine at the 3′ end of each of the two DNA strands, the DNA obtained was inserted directly into a plasmid using the TA Cloning™ kit (British Biotechnology). The 2 μl of DNA solution were mixed with 5 μl of sterile distilled water, 1 μl of a 10-fold concentrated ligation buffer “10×LIGATION BUFFER”, 2 μl of “pCR™ VECTOR” (25 ng/μl) and 1 μl of “T4 DNA LIGASE”. This mixture was incubated overnight at 12° C. The following steps were carried out according to the instructions of the TA Cloning™ kit (British Biotechnology). At the end of the procedure, the white colonies of recombinant bacteria were picked out in order to be cultured and to permit extraction of the plasmids incorporated according to the so-called “miniprep” procedure (17). The plasmid preparation from each recombinant colony was cut with a suitable restriction enzyme and analyzed on agarose gel. The plasmids possessing an insert detected under UV light after staining the gel with ethidium bromide were selected for sequencing of the insert, after hybridization with a primer complementary to the Sp6 promoter present on the cloning plasmid of the TA Cloning Kit™. The reaction prior to sequencing was then performed according to the method recommended for the use of the sequencing kit “Prism ready reaction kit dye deoxyterminator cycle sequencing kit” (Applied Biosystems, ref. 401384), and automatic sequencing was carried out with an Applied Biosystems “Automatic Sequencer, model 373 A” apparatus according to the manufacturer's instructions.

[0438] Thus, a clone designated LTRGAG12 could be obtained, and is represented by its internal sequence identified by SEQ ID NO: 56.

[0439] This clone is probably representative of endogenous elements close to ERV-9, present in human DNA, in particular in the DNA of patients suffering from MS, and capable of interfering with the expression of the MSRV-1 retrovirus, hence capable of having a role in the pathogenesis associated with the MSRV-1 retrovirus and capable of serving as marker for a specific expression in the pathology in question.

EXAMPLE 17 Detection of Anti-MSRV-1 Specific Antibodies in Human Serum

[0440] Identification of the sequence of the pol gene of the MSRV-1 retrovirus and of an open reading frame of this gene enabled the amino acid sequence SEQ ID NO: 63 of a region of the said gene, referenced SEQ ID NO: 58, to be determined.

[0441] Different synthetic peptides corresponding to fragments of the protein sequence of MSRV-1 reverse transcriptase encoded by the pol gene were tested for their antigenic specificity with respect to sera of patients suffering from MS and of healthy controls.

[0442] The peptides were synthesized chemically by solid-phase synthesis according to the Merrifield technique (22). The practical details are those described below.

[0443] a) Peptide synthesis:

[0444] The peptides were synthesized on a phenylacetamidomethyl (PAM)/polystyrene/divinylbenzene resin (Applied Biosystems, Inc. Foster City, Calif.), using an “Applied Biosystems 430A” automatic synthesizer. The amino acids are coupled in the form of hydroxybenzotriazole (HOBT) esters. The amino acids used are obtained from Novabiochem (Läufelfingen, Switzerland) or Bachem (Bubendorf, Switzerland).

[0445] The chemical synthesis was performed using a double coupling protocol with N-methylpyrrolidone (NMP) as solvent. The peptides were cleaved from the resin, and the side-chain protective groups removed at the same time, using hydrofluoric acid (HF) in a suitable apparatus (type I cleavage apparatus, Peptide Institute, Osaka, Japan).

[0446] For 1 g of peptidyl resin, 10 ml of HF, 1 ml of anisole and 1 ml of dimethyl sulphide (DMS) are used. The mixture is stirred for 45 minutes at −2° C. The HF is then evaporated off under vacuum. After intensive washes with ether, the peptide is eluted from the resin with 10% acetic acid and then lyophilized.

[0447] The peptides are purified by preparative high performance liquid chromatography on a VYDAC C18 type column (250×21 mm) (The Separation Group, Hesperia, Calif., USA). Elution is carried out with an acetonitrile gradient at a flow rate of 22 ml/min. The fractions collected are monitored by an elution under isocratic conditions on a VYDAC™ C18 analytical column (250×4.6 mm) at a flow rate of 1 ml/min. Fractions having the same retention time are pooled and lyophilized. The preponderant fraction is then analyzed by analytical high performance liquid chromatography with the system described above. The peptide which is considered to be of acceptable purity manifests itself in a single peak representing not less than 95% of the chromatogram.

[0448] The purified peptides are then analyzed with the object of monitoring their amino acid composition, using an Applied Biosystems 420H automatic amino acid analyzer. Measurement of the (average) chemical molecular mass of the peptides is obtained using LSIMS mass spectrometry in the positive ion mode on a VG ZAB.ZSEQ double focusing instrument connected to a DEC-VAX 2000 acquisition system (VG analytical Ltd, Manchester, England).

[0449] The reactivity of the different peptides was tested against sera of patients suffering from MS and against sera of healthy controls. This enabled a peptide designated S24Q to be selected, whose sequence is identified by SEQ ID NO: 59, encoded by a nucleotide sequence of the pol gene of MSRV-1 (SEQ ID NO: 58).

[0450] b) Antigenic properties:

[0451] The antigenic properties of the S24Q peptide were demonstrated according to the ELISA protocol described below.

[0452] The lyophilized S24Q peptide was dissolved in 10% acetic acid at a concentration of 1 mg/ml. This stock solution was aliquoted and kept at +4° C. for use over a fortnight, or frozen at −20° C. for use within 2 months. An aliquot is diluted in PBS (phosphate buffered saline) solution so as to obtain a final peptide concentration of 5 micrograms/ml. 100 microliters of this dilution are placed in each well of Nunc Maxisorb (trade name) microtitration plates. The plates are covered with a “plate-sealer” type adhesive and kept for 2 hours at +37° C. for the phase of adsorption of the peptide to the plastic. The adhesive is removed and the plates are washed three times with a volume of 300 microliters of a solution A (1× PBS, 0.05% Tween 20®), then inverted over an absorbent tissue. The plates thus drained are filled with 250 microliters per well of a solution B (solution A+10% of goat serum), then covered with an adhesive and incubated for 1 hour at 37° C. The plates are then washed three times with the solution A as described above.
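The dilution of the peptide stock for coating follows directly from C1·V1 = C2·V2. The sketch below works it through for the 1 mg/ml stock, the 5 micrograms/ml coating concentration and 100 microliters per well given in the text; the choice of a full 96-well plate and the function name are assumptions made for the example, and no allowance is made for pipetting dead volume.

```python
def stock_volume_ul(stock_ug_per_ml, final_ug_per_ml, final_volume_ul):
    """Solve C1*V1 = C2*V2 for V1, the volume of stock to pipette (in microliters)."""
    return final_ug_per_ml * final_volume_ul / stock_ug_per_ml


wells = 96                           # assumption: one full 96-well plate
coating_volume_ul = wells * 100      # 100 microliters per well, as in the text
stock_ul = stock_volume_ul(1000.0, 5.0, coating_volume_ul)  # 1 mg/ml = 1000 ug/ml
pbs_ul = coating_volume_ul - stock_ul
dilution_factor = 1000.0 / 5.0
print(stock_ul, pbs_ul, dilution_factor)
```

The stock is thus diluted 200-fold, so coating one plate consumes only a small fraction of a 1 ml aliquot.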

[0453] The test serum samples are diluted beforehand to 1/100 in the solution B, and 100 microliters of each dilute test serum are placed in the wells of each microtitration plate. A negative control is placed in one well of each plate, in the form of 100 microliters of buffer B. The plates covered with an adhesive are then incubated for 1 hour 30 min at 37° C. The plates are then washed three times with the solution A as described above. For the IgG response, a peroxidase-labelled goat antibody directed against human IgG (marketed by Jackson Immuno Research Inc.) is diluted in the solution B (dilution 1/10,000). 100 microliters of the appropriate dilution of the labelled antibody are then placed in each well of the microtitration plates, and the plates covered with an adhesive are incubated for 1 hour at 37° C. A further washing of the plates is then performed as described above. In parallel, the peroxidase substrate is prepared according to the directions of the bioMérieux kits. 100 microliters of substrate solution are placed in each well, and the plates are placed protected from light for 20 to 30 minutes at room temperature.

[0454] When the color reaction has stabilized, 50 microliters of Color 2 (bioMérieux trade name) are placed in each well in order to stop the reaction. The plates are placed immediately in an ELISA plate spectrophotometric reader, and the optical density (OD) of each well is read at a wavelength of 492 nm.

[0455] The serological samples are introduced in duplicate or in triplicate, and the optical density (OD) corresponding to the serum tested is calculated by taking the mean of the OD values obtained for the same sample at the same dilution.

[0456] The net OD of each serum corresponds to the mean OD of the serum minus the mean OD of the negative control (solution B: PBS, 0.05% Tween 20®, 10% goat serum).
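The net OD defined above is a difference of two means, which can be sketched as follows. The replicate readings are hypothetical values chosen for illustration and do not come from the experiments reported here; the function name is likewise illustrative.

```python
from statistics import mean


def net_od(sample_replicates, blank_replicates):
    """Mean OD of the serum replicates minus mean OD of the negative control."""
    return mean(sample_replicates) - mean(blank_replicates)


# Hypothetical readings at 492 nm, for illustration only.
serum = [0.412, 0.398, 0.405]   # one serum, tested in triplicate
blank = [0.051, 0.049]          # negative control wells (solution B)

value = round(net_od(serum, blank), 3)
print(value)
```

Subtracting the blank removes the signal contributed by the buffer and conjugate alone, so that sera read on different plates can be compared.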

[0457] c) Detection of anti-MSRV-1 IgG antibodies (S24Q) by ELISA:

[0458] The technique described above was used with the S24Q peptide to test for the presence of anti-MSRV-1 specific IgG antibodies in the serum of 15 patients for whom a definite diagnosis of MS was established according to the criteria of Poser (23), and of 15 healthy controls (blood donors).

[0459] FIG. 41 shows the results for each serum tested with an anti-IgG antibody. Each vertical bar represents the net optical density (OD at 492 nm) of a serum tested. The ordinate axis gives the net OD at the top of the vertical bars. The first 15 vertical bars lying to the left of the vertical broken line represent the sera of 15 healthy controls (blood donors), and the 15 vertical bars lying to the right of the vertical broken line represent the sera of 15 cases of MS tested. The diagram enables 2 controls to be revealed whose OD rises above the grouped values of the control population. These values may represent the presence of specific IgGs in symptomless seropositive patients. Two methods were hence evaluated in order to determine the statistical threshold of positivity of the test.

[0460] The mean of the net OD values for the controls, including the controls with high net OD values, is 0.129 and the standard deviation is 0.06. Without the 2 controls whose OD values are greater than 0.2, the mean of the “negative” controls is 0.107 and the standard deviation is 0.03. A theoretical threshold of positivity may be calculated according to the formula:

threshold value = (mean of the net OD values of the negative controls) + (2 or 3 × standard deviation of the net OD values of the negative controls).

[0461] In the first case, the two controls with high OD values are considered to be symptomless seropositives and are excluded, and the threshold value is equal to 0.11+(3×0.03)=0.20. The negative results represent a non-specific “background” of the presence of antibodies directed specifically against an epitope of the peptide.

[0462] In the second case, if the set of controls consisting of blood donors in apparent good health is taken as a reference basis, without excluding the sera which are, on the face of it, seropositive, the standard deviation of the “non-MS controls” is 0.06. The threshold value then becomes 0.13+(3×0.06)=0.31.
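The two threshold calculations can be reproduced with a one-line function. This is a sketch using the rounded means and standard deviations quoted in the text (0.11 and 0.03 for the first case, 0.13 and 0.06 for the second) and a multiplier of 3; the function name and the parameter `k` are illustrative, with `k` standing for the "2 or 3" of the formula.

```python
def positivity_threshold(mean_od, sd_od, k=3):
    """Threshold = mean of the negative controls + k standard deviations."""
    return mean_od + k * sd_od


# First case: the two high-OD controls are excluded from the reference set.
threshold_excluding = positivity_threshold(0.11, 0.03)
# Second case: all 15 blood donors are kept as the reference set.
threshold_all = positivity_threshold(0.13, 0.06)

print(f"{threshold_excluding:.2f} {threshold_all:.2f}")
```

Keeping the two high-OD controls raises both the mean and the standard deviation, which is why the second threshold is markedly more stringent than the first.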

[0463] According to this latter analysis, the test is specific for MS.In this respect, it is seen that the test is specific for MS, since, asshown in Table 1, no control has a net OD above this threshold. In fact,this result reflects the fact that the antibody titers in patientssuffering from MS are, for the most part, higher than in healthycontrols who have been in contact with MSRV-1.

[0464] In accordance with the first method of calculation, and as shownin FIG. 41 and in Table 3, 6 of the 15 MS sera give a positive result(OD greater than or equal to 0.2), indicating the presence of IgGsspecifically directed against the S24Q peptide, hence against a portionof the reverse transcriptase enzyme of the MSRV-1 retrovirus encoded byits pol gene, and consequently against the MSRV-1 retrovirus.

[0465] Thus, approximately 40% of the MS patients tested have reactedagainst an epitope carried by the S24Q peptide and possess circulatingIgGs directed against the latter.

[0466] Two out of 15 blood donors in apparent good health show a positive result. Thus, it is apparent that approximately 13% of the symptomless population may have been in contact with an epitope carried by the S24Q peptide under conditions which have led to an active immunization which manifests itself in the persistence of specific serum IgGs. These conditions are compatible with an immunization against the MSRV-1 retrovirus reverse transcriptase during an infection with (and/or reactivation of) the MSRV-1 retrovirus. The absence of apparent neurological pathology recalling MS in these seropositive controls may indicate that they are healthy carriers and have eliminated an infectious virus after immunizing themselves, or that they constitute an at-risk population of chronic carriers. In effect, epidemiological data showing that a pathogenic agent present in the environment of regions of high prevalence of MS may be the cause of this disease imply that a fraction of the population free from MS has necessarily been in contact with such a pathogenic agent. It has been shown that the MSRV-1 retrovirus constitutes all or part of this “pathogenic agent” at the source of MS, and it is hence normal for controls taken from a healthy population to possess IgG type antibodies against components of the MSRV-1 retrovirus.

[0467] Lastly, the detection of anti-S24Q antibodies in only one out of two MS cases tested here may reflect the fact that this peptide does not represent an immunodominant MSRV-1 epitope, that inter-individual strain variations may induce an immunization against a divergent peptide motif in the same region, or that the course of the disease and the treatments followed may modulate over time the antibody response against the S24Q peptide.

TABLE No. 3

             CONTROLS   MS
             0.101      0.136
             0.058      0.391
             0.126      0.37
             0.131      0.119
             0.105      0.267
             0.294      0.141
             0.116      0.102
             0.088      0.18
             0.105      0.411
             0.172      0.164
             0.137      0.049
             0.223      0.644
             0.08       0.268
             0.073      0.065
             0.132      0.074
Mean         0.129
Std. Dev.    0.06
Threshold    0.31
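The two threshold calculations described above can be reproduced from the net OD values of Table 3. The following is only a minimal sketch: the function names are illustrative, and it assumes that the sample (n-1) standard deviation was used and that a serum is scored positive when its net OD is greater than or equal to the threshold; the text does not state the former.

```python
from statistics import mean, stdev

# Net OD values (492 nm) transcribed from Table 3.
controls = [0.101, 0.058, 0.126, 0.131, 0.105, 0.294, 0.116, 0.088,
            0.105, 0.172, 0.137, 0.223, 0.080, 0.073, 0.132]
ms = [0.136, 0.391, 0.370, 0.119, 0.267, 0.141, 0.102, 0.180,
      0.411, 0.164, 0.049, 0.644, 0.268, 0.065, 0.074]

def threshold(values, k=3):
    """Positivity threshold: mean + k * sample standard deviation (assumed)."""
    return mean(values) + k * stdev(values)

def count_positive(values, cutoff):
    """Number of sera whose net OD is at or above the cutoff."""
    return sum(v >= cutoff for v in values)

# First method: exclude the controls whose net OD is greater than 0.2.
negatives = [v for v in controls if v <= 0.2]
t_first = threshold(negatives)

# Second method: keep all 15 blood donors as the reference set.
t_second = threshold(controls)

print(f"first method threshold:  {t_first:.2f}")
print(f"second method threshold: {t_second:.2f}")
print("MS positive (first):", count_positive(ms, t_first))
print("controls positive (first):", count_positive(controls, t_first))
print("controls positive (second):", count_positive(controls, t_second))
```

Under these assumptions the rounded thresholds agree with the 0.20 and 0.31 values quoted in the text.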

[0468] d) Detection of anti-MSRV-1 IgM antibodies by ELISA:

[0469] The ELISA technique with the S24Q peptide was used to test for the presence of anti-MSRV-1 IgM specific antibodies in the same sera as above.

[0470] FIG. 42 shows the results for each serum tested with an anti-IgM antibody. Each vertical bar represents the net optical density (OD at 492 nm) of a serum tested. The ordinate axis gives the net OD at the top of the vertical bars. The first 15 vertical bars lying to the left of the vertical broken line cutting the abscissa axis represent the sera of 15 healthy controls (blood donors), and the vertical bars lying to the right of the vertical broken line represent the sera of 15 cases of MS tested.

[0471] The mean of the OD values for the MS cases tested is 1.6.

[0472] The mean of the net OD values for the controls is 0.7.

[0473] The standard deviation of the negative controls is 0.6.

[0474] The threshold of theoretical positivity may be calculated according to the formula:

threshold value = (mean of the OD values of the negative controls) + (3 × standard deviation of the OD values of the negative controls)

[0475] The threshold value is hence equal to 0.7+(3×0.6)=2.5.

[0476] The negative results represent a non-specific “background” of the presence of antibodies directed specifically against an epitope of the peptide.

[0477] According to this analysis, and as shown in FIG. 42 and in the corresponding Table 4, the IgM test is specific for MS, since no control has a net OD above the threshold. Six of the 15 MS sera produce a positive IgM result.

[0478] The difference in seroprevalence between the MS and control populations is extremely significant: “chi-squared” test, p<0.002.

[0479] These results point to an aetiopathogenic role of MSRV-1 in MS.

[0480] Thus, the detection of IgM and IgG antibodies against the S24Q peptide makes it possible to evaluate, alone or in combination with other MSRV-1 peptides, the course of an MSRV-1 infection and/or of the viral reactivation of MSRV-1.

TABLE No. 4

                   CONTROLS   MS
                   0.449      0.974
                   0.371      6.117
                   0.448      2.883
                   0.456      1.945
                   0.885      1.787
                   2.235      0.273
                   0.301      1.766
                   0.138      0.668
                   0.16       2.603
                   1.073      0.802
                   1.366      0.245
                   0.283      0.147
                   0.262      2.441
                   0.585      0.287
                   0.356      0.589
Mean               0.7
Std. Dev.          0.6
Threshold Value    2.5

[0481] It is possible, as a result of the new discoveries made and thenew methods developed by the inventors, to permit the improvedimplementation of diagnostic tests for MSRV-1 infection and/orreactivation and to evaluate a therapy in MS and/or RA on the basis ofits efficacy in “negativing” the detection of these agents in thepatient's biological fluids. Furthermore, early detection in individualsnot yet displaying neurological signs of MS or rheumatological signs ofRA could make it possible to institute a treatment which would be allthe more effective with respect to the subsequent clinical course forthe fact that it would precede the lesion stage which corresponds to theonset of the clinical disorders. Now, at the present time, a diagnosisof MS or RA cannot be established before a symptomatology of lesions hasset in, and hence no treatment is instituted before the emergence of aclinical picture suggestive of lesions which are already significant.The diagnosis of an MSRV-1 and/or MSRV-2 infection and/or reactivationin man is hence of decisive importance, and the present inventionprovides the means of doing this.

[0482] It is thus possible, apart from carrying out a diagnosis of MSRV-1 infection and/or reactivation, to evaluate a therapy in MS on the basis of its efficacy in “negativing” the detection of these agents in the patients' biological fluids.

EXAMPLE 18

[0483] 1) Materials and Methods

[0484] Patients and Clinical Samples

[0485] Choroid plexus cells from MS patients and controls were obtained from the brain-cell library, Laboratoire R. Escourolles, Hôpital de la Salpêtrière, Paris, France. Non-tumoral leptomeningeal cells from controls were obtained as previously described (26). Peripheral blood from MS and control patients, used for obtaining B-cell lines and plasma, was obtained from the Neurological Departments, CHU de Grenoble, and from INSERM U 134, Hôpital de la Salpêtrière, France. Clinical details and origin of the 10 MS patients and of the 10 patients with other neurological diseases who provided CSF samples are given in Table 6.

[0486] Cell Cultures, Virus Isolation and Purification

[0487] All cell-types were cultured as previously described (3, 5, 26). All cultures were regularly screened for mycoplasma contamination with an ELISA mycoplasma-detection kit (Boehringer). None of the cell extracts or supernatants used contained detectable mycoplasma.

[0488] Extracellular virion purification and sucrose density gradients were performed as previously described (3, 5, 26). From each sucrose gradient 0.5-1 ml fractions were collected from the top of the tubes, with a 1000 μl Pipetman and a different sterile tip for each fraction. 60 μl were used for RT activity assay and the rest was mixed with 1 volume of buffer containing 4M guanidinium thiocyanate, 0.5% N-lauroylsarcosine, 25 mM EDTA, 0.2% β-mercaptoethanol adjusted to pH 5.5 with acetic acid. These mixtures were frozen at −80° C. for further RNA extraction or directly processed according to Chomczynski (20), with an overnight precipitation step at −20° C., in the presence of RNase-free glycogen (Boehringer). RNA was dissolved in 20 to 50 μl of DEPC-treated water in the presence of 1-21 μl of recombinant RNase-inhibitor (PROMEGA) and 0.1 mM DTT. 10 μl aliquots were used for each RT-PCR.

[0489] Reverse Transcriptase Activity

[0490] RT-activity was tested with 20 mM Mg⁺⁺ and poly-Cm or polyC templates, in virion pellets or fractions from sucrose gradients as previously described (3, 5, 26).

[0491] cDNA Synthesis and ‘Pan-retro’ RT-PCR with Degenerate Primers

[0492] A total RT-activity between 10⁶-10⁷ dpm was required in the fraction containing the peak of purified virions. The “Pan-retro” RT-PCR technique (27) was performed on virion RNA extracted by the method of Chomczynski (20) and dissolved in 20 μl RNase-free water. 5 μl RNA solution was incubated for 30 min at 37° C. with 0.3 units (3 units for CSF series) of RNase-free DNase-1 (Boehringer) in a 20 μl reaction containing 7.5 mM random hexamers, 5 mM Hepes-HCl pH 6.9, 75 mM KCl, 3 mM MgCl₂, 10 mM DTT, 50 mM Tris-HCl pH 7.5, 0.5 mM each dNTP, and 20 units recombinant RNase inhibitor (Promega). The DNase was then heat inactivated at 80° C. for 10 min. 20 units MoMLV RT (Pharmacia) and a further 20 units of RNase inhibitor were added to each tube in a Genesphere™ enclosure (Safetech, Ireland) and cDNA was synthesised for 90 min at 37° C. Following reverse transcription, the cDNA was boiled for 5 min then cooled rapidly on ice. The Round 1 PCR mix (final volume 25 μl per reaction; 20 mM Tris-HCl pH 8.4, 60 mM KCl, 2.5 mM MgCl₂, 200 ng each of primers Pan-UO and Pan-DI [see FIG. 44], 0.2 mM each dNTP) was treated with 0.3 units DNase-1 and then heat inactivated as above. 2.5 μl cDNA was added in the Genesphere™ enclosure and the tubes heated to 80° C. before adding 0.5 units Taq polymerase (Perkin Elmer) individually to each tube (“hot start”). Round 1 PCR parameters were 35 cycles of 95° C. for 1 min, 34° C. for 30 sec, 72° C. for 1 min, with a final 7 min extension at 72° C. 0.5 μl of Round 1 PCR product was transferred to the Round 2 DNase-treated PCR mix (composition as for Round 1 but containing primers Pan-UI and Pan-DI) using the “hot start” procedure. Round 2 PCR parameters were as for Round 1 but using 30 cycles only and annealing at 45° C. for 1 min.

[0493] Cloning of PCR Products

[0494] PCR products were cloned using the TA-cloning™ kit (British Biotechnology) according to the manufacturer's recommendations.

[0495] Sequencing

[0496] Sequencing reactions were performed using the “Prism ready reaction kit dye deoxyterminator cycle sequencing kit” (Applied Biosystems). Automatic sequence analysis was performed on an automatic sequencer (Applied Biosystems, 373 A).

[0497] RT-PCR with ST1 Primer Sets

[0498] The first PCR round was performed directly from the cDNA reaction mixture according to the one-step RT-PCR technique described by Mallet et al. (28). This one-step RT-PCR procedure reduced the probability of airborne contamination when opening the tubes and transferring PCR reagents after an independent cDNA synthesis. RNA was extracted as previously from 2 ml of plasma (snap-frozen in liquid nitrogen and stored at −80° C.) or from a 500 μl sucrose fraction with a total RT-activity above 10⁶ dpm, and resuspended in 50 μl of RNase-free water. For each RT-PCR reaction 10 μl of RNA solution was incubated in a Perkin-Elmer 480 thermocycler, 15 min at 20° C. with 1U of RNase-free DNase 1 and 1.2 μl of 10×DNase buffer (50 mM Tris, 10 mM MgCl₂ and 0.1 mM DTT) containing 1U/ml of RNase-inhibitor (PROMEGA), and heated at 70° C. for 10 min for DNase inactivation. The solution was placed on ice and mixed (in conditions preventing airborne dust/DNA contamination) with 88 μl of PCR mix containing: 1×taq buffer, 25 nM/tube dNTPs, 40 pM/tube of each first round primer (ST1.1 upstream primer:

[0499] 5′ AGGAGTAAGGAAACCCAACGGAC 3′ (SEQ ID NO: 15); ST1.1 downstream primer: 5′ TAAGAGTTGCACAAGTGCG 3′ (SEQ ID NO: 16)), 2.5U/tube of taq (Appligene) and 10U/tube of AMV-RT (Boehringer). Each tube was further incubated in a Perkin-Elmer 480 thermocycler for 10 min at 65° C., followed by 2h at 42° C. for cDNA synthesis and 5 min at 95° C. for inactivation of AMV-RT and DNA denaturation. First round parameters were 40 cycles of 95° C. for 1 min, 53° C. for 2.5 min, 72° C. for 1 min, with a final extension of 10 min at 72° C. 10 μl of the first round were transferred to the second round PCR mix previously treated at 20° C. for 15 min with RNase-free DNase 1 (0.02U/ml) followed by DNase inactivation at 70° C. for 10 min. This mix contained 1×taq buffer, 25 nM/tube dNTPs, 40 pM/tube of each second round primer [ST1.2 upstream primer: 5′ TCAGGGATAGCCCCCATCTAT 3′ (SEQ ID NO: 17); ST1.2 downstream primer: 5′ AACCCTTTGCCACTACATCAATTT 3′ (SEQ ID NO: 18)] and 2.5U/tube of taq (Appligene). Second round parameters were 30 cycles of 95° C. for 1 min, 53° C. for 1.5 min, 72° C. for 1 min, with a final extension of 8 min at 72° C. 20 μl of this nested RT-PCR product were deposited on a 0.7% agarose gel containing ethidium bromide and exposed to UV light for the visualization of amplified products.
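As a quick sanity check on the four ST1 primers quoted above, their length, GC content and a rough melting temperature can be tabulated. This is only an illustrative sketch: the Wallace rule (2 °C per A/T, 4 °C per G/C) is a coarse rule of thumb intended for short oligonucleotides and is not stated in the source to be the basis of the 53° C. annealing temperature; the function names are hypothetical.

```python
# Primer sequences as quoted in the text (5' to 3').
PRIMERS = {
    "ST1.1 upstream": "AGGAGTAAGGAAACCCAACGGAC",     # SEQ ID NO: 15
    "ST1.1 downstream": "TAAGAGTTGCACAAGTGCG",       # SEQ ID NO: 16
    "ST1.2 upstream": "TCAGGGATAGCCCCCATCTAT",       # SEQ ID NO: 17
    "ST1.2 downstream": "AACCCTTTGCCACTACATCAATTT",  # SEQ ID NO: 18
}

def gc_fraction(seq):
    """Fraction of G and C bases in the sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)

def wallace_tm(seq):
    """Rule-of-thumb melting temperature in deg C: 2*(A+T) + 4*(G+C)."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc

for name, seq in PRIMERS.items():
    print(f"{name}: {len(seq)} nt, GC {gc_fraction(seq):.0%}, "
          f"Wallace Tm {wallace_tm(seq)} C")
```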

[0500] Hybridisation Analysis of PCR Products: MSRV-pol Detection byELOSA

[0501] The protocol was essentially as previously described (21) but with the following modifications: Nunc Maxisorb microtiter plates were coated with 100 ng per well capture probe CpV1b (see FIG. 44) either by passive adsorption (21) or alternatively by using streptavidin coated plates and biotinylated CpV1b. Peroxidase-labelled detector probe DpV1 (see FIG. 44) was used and the assay cut-off was defined as the mean of 4 negative controls plus 0.2 OD₄₉₂ units.
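The ELOSA cut-off rule stated above (mean of the negative controls plus 0.2 OD₄₉₂ units) can be written as a short helper. This is a sketch only: the OD readings below are hypothetical placeholders, not data from the study, and scoring a sample positive when its OD is strictly above the cut-off is an assumption.

```python
from statistics import mean

def elosa_cutoff(negative_control_ods, margin=0.2):
    """Cut-off = mean OD492 of the negative controls + fixed margin."""
    return mean(negative_control_ods) + margin

def is_positive(sample_od, cutoff):
    """Assumed rule: positive when the OD is strictly above the cut-off."""
    return sample_od > cutoff

# Hypothetical OD492 readings for 4 negative controls (illustration only).
negatives = [0.05, 0.06, 0.04, 0.05]
cutoff = elosa_cutoff(negatives)

print(f"cut-off: {cutoff:.2f}")
print("sample at OD 0.90 positive:", is_positive(0.90, cutoff))
print("sample at OD 0.10 positive:", is_positive(0.10, cutoff))
```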

[0502] RNA Extraction, cDNA Synthesis and PCR Amplification from MSPlasma Samples:

[0503] Total RNA was extracted from human MS plasma by a guanidinium method as described elsewhere (29). Total RNA extracted from 100 μl of plasma was treated with RNase-free DNase 1 (0.1U/ml; Boehringer Mannheim, France) and reverse transcribed under the conditions recommended by the manufacturer, using Superscript reverse transcriptase (Gibco-BRL, France). The resulting cDNAs were amplified by semi-nested PCR through 35 cycles (94° C. 1 min, 55° C. 1 min, 72° C. 1 min 30 sec) and 72° C. 8 min for a final extension. Three different fragments in the RT region were amplified by the following specific primers:

[0504] in the protease (PRT) region, for the 1st and 2nd round of PCR, respectively, sense primer [5′ TCC AGC AGC AGG ACT GAG GGT 3′ (SEQ ID NO: 93)] and antisense primers [5′ CTG TCC GTT GGG TTT CCT TAC TCC T 3′ (SEQ ID NO: 72)/5′ GAC AGC AAA TGG GTA TTC CTT TCC 3′ (SEQ ID NO: 94)]

[0505] in the fragment A of the RT region (Cf. FIG. 46), for the 1st and 2nd round of PCR, respectively, sense primer [5′ AGG AGT AAG GAA ACC CAA CGG ACA G 3′ (SEQ ID NO: 95)] and antisense primers [5′ TGT ATA TAA TGG TCT GGC TAT TGG G 3′ (SEQ ID NO: 96)/5′ TTC GGC AGA AAC CTG TTA TGC CAA GG 3′ (SEQ ID NO: 76)]

[0506] in the fragment B of the RT region (Cf. FIG. 46), for the 1st and 2nd round of PCR, respectively, sense primers [5′ GGC TCT GCT CAC AGG AGA TTA GAT AC 3′ (SEQ ID NO: 97)/5′ AAA GGC ACC AGG GCC CTC AGT GAG GA 3′ (SEQ ID NO: 98)] and antisense primer [5′ GGT TTA AGA GTT GCA CAA GTG CGC AGT C 3′ (SEQ ID NO: 99)].

[0507] The amplified fragments were analyzed on ethidium bromide-stained agarose gels, cloned in the TA cloning vector (Invitrogen) and sequenced.

[0508] 2) Results

[0509] Specific Retroviral RNA is Found in Extracellular Virions from MS Patient-Derived Cell Cultures and in MS Patients' CSF.

[0510] Choroid plexus cells (4) (obtained post-mortem) and EBV-immortalized peripheral blood B-lymphocytes (30, 31) from MS patients gave rise to cultures expressing 100-120 nm viral particles associated with RT-activity similar to that of the original LM7 isolate (3). Similar cell-types from non-MS donors produced neither this RT-activity nor virions. All the ‘infected’ cultures were poorly and/or transiently productive and/or had a limited lifespan. Therefore, in order to analyze the genomic RNA present in the very limited quantity of extracellular virions, we used an RT-PCR approach to amplify, with degenerate primers, a conserved region of the pol gene present in all known retroviruses (12); the techniques based on this approach will be called “Pan-retro” RT-PCR. Extensive DNase treatment of samples and reagents was essential, because human DNA contains many endogenous retroviral elements amplifiable by this technique. “Pan-retro” RT-PCR experiments were performed on sucrose-density gradient purified virions from supernatants of different types of cell cultures and their non-infected controls: (i) choroid plexus cells sampled post-mortem from MS brain (PLI-1), (ii) choroid plexus cells from non-MS brain autopsy, infected by co-culture with irradiated LM7 cells (LM7P), and (iii) identical non-infected choroid-plexus cells. “Early” B-cell lines obtained by spontaneous in vitro transformation of two EBV-seropositive individuals, (iv) one MS patient and (v) one non-MS control, were also analysed. FIG. 43 illustrates the RT-activity in sucrose-gradient fractions obtained from the B-cell cultures. The technique described by Shih et al. (12) was modified in a semi-nested RT-PCR protocol (27) using degenerate primers (FIG. 2) and extensive DNase treatment. PCR amplifications were performed in London (Dpt of Virology, U.C.L.M.S.) on coded aliquots of the density gradient fractions. Blind and systematic cloning and sequencing of the PCR products were undertaken in an independent laboratory (bioMérieux, Lyon). After complete sequencing of 20 to 30 clones per sucrose gradient fraction, the codes were broken and results analysed in parallel with the RT-activity data.

TABLE 5
SEQUENCES GENERATED BY ‘PAN-RETROVIRUS’ PCR OF DENSITY GRADIENT FRACTIONS (containing the peak of RT-activity or the corresponding control fraction)

CULTURE                       MSRV c-pol   ERV9^((v))   PCR artefacts^((vI))   Total clones
LM7P (I)                      16           4            6                      26
PLI-1 (II)                    9            1            13                     23
MS B-CELL LINE (III)          9            2            8                      19
CONTROL B-CELL LINE (Iv)      0            0            26                     28

[0511] Table 5 presents the distribution of sequences obtained from sucrose gradient fractions containing the peak of viral RT-activity in MS-derived cultures and also the sequences amplified from the corresponding RT-activity negative fractions of uninfected cultures. The predominant sequence detected in bands of the expected size (≅140 bp) amplified in all the RT-activity positive fractions (but not in the RT-activity negative fractions) was different from known retroviruses and was designated MSRV-cpol. MSRV-cpol sequences exhibited partial homology (70-75%) with ERV9, a previously described endogenous retroviral sequence (18). A few ERV9 sequences (>90% homology with ERV9) were also present but clearly represented a minority of clones. In addition to typical pol sequences, numerous PCR artifacts (primer multimers, concatemers or single-primer amplifications) related to the use of degenerate primers and low-temperature annealing, were found in all samples (Table 5).

[0512] FIG. 44 shows an alignment of a consensus sequence of MSRV-cpol with the corresponding VLPQG/YMDD region of diverse retroviruses. FIG. 45 displays a phylogenic tree based on the evolutionarily conserved amino acid sequences of both exogenous and endogenous retroviruses in this region. From this tree it can be seen that the pol gene of MSRV is phylogenically related to the C-type group of oncovirinae.
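The clone distribution of Table 5 can be summarised as the share of MSRV-cpol clones per culture. The sketch below uses the counts as printed in Table 5; the dictionary layout and function names are illustrative assumptions. It also lists any row whose three categories do not add up to the printed total, without attempting to explain the difference.

```python
# Clone counts as printed in Table 5.
TABLE5 = {
    "LM7P": {"msrv_cpol": 16, "erv9": 4, "artefacts": 6, "total": 26},
    "PLI-1": {"msrv_cpol": 9, "erv9": 1, "artefacts": 13, "total": 23},
    "MS B-cell line": {"msrv_cpol": 9, "erv9": 2, "artefacts": 8, "total": 19},
    "Control B-cell line": {"msrv_cpol": 0, "erv9": 0, "artefacts": 26, "total": 28},
}

def msrv_fraction(row):
    """Share of clones carrying the MSRV-cpol sequence."""
    return row["msrv_cpol"] / row["total"]

def rows_with_mismatch(table):
    """Cultures whose listed categories do not sum to the printed total."""
    return [name for name, r in table.items()
            if r["msrv_cpol"] + r["erv9"] + r["artefacts"] != r["total"]]

for name, row in TABLE5.items():
    print(f"{name}: {msrv_fraction(row):.1%} MSRV-cpol clones")
print("rows to re-check against the source table:", rows_with_mismatch(TABLE5))
```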

[0513] A small scale study was performed to determine the prevalence of MSRV-cpol sequences in the CSF of patients with MS. Identification of MSRV-cpol in PCR products by cloning and sequencing is both laborious and time consuming. We therefore devised an enzyme-linked oligosorbent assay (ELOSA), using a capture probe (CpV1B) and a peroxidase-labelled detector probe (DpV1), for the rapid identification of MSRV-cpol sequences in ‘Pan-retrovirus’ PCR products (FIG. 44). The specificity of this sandwich hybridization-based assay for HMSRV-cpol was tested with both distantly related (HIV and MoMLV) and closely related (ERV9) pol sequences. No significant cross reactivity with such targets was observed despite the ability of the ELOSA to detect as little as 0.01 ng of MSRV-cpol DNA.

TABLE 6
DETECTION OF HMSRV IN THE CSF OF PATIENTS WITH MULTIPLE SCLEROSIS AND OTHER NEUROLOGICAL DISEASES

Patient¹  Age/Sex   Diagnosis            MS Type              MS Activity        MS Duration  Treatment at sampling                       MSRV ELOSA
ITMS1     27 yrs/M  multiple sclerosis   2° progressive       slow progression    5 yrs       corticosteroids                             negative
ITMS2     55 yrs/M  multiple sclerosis   1° progressive       slow progression    9 yrs       none                                        POSITIVE
ITMS3     51 yrs/F  multiple sclerosis   1° progressive       slow progression    2 yrs       none                                        negative
ITMS4     22 yrs/F  multiple sclerosis   relapsing/remitting  In remission        8 yrs       none                                        POSITIVE
ITMS5     27 yrs/F  multiple sclerosis   1° progressive       slow progression    8 yrs       cyclophosphamide                            negative
ITMS6     33 yrs/M  multiple sclerosis   2° progressive       slow progression   16 yrs       none (previously cycloph. + corticost.)     negative
ITMS7     33 yrs/F  multiple sclerosis   2° progressive       slow progression    9 yrs       none                                        POSITIVE
ITMS8     25 yrs/F  multiple sclerosis   relapsing/remitting  stable              3 yrs       none                                        POSITIVE
ITMS9     36 yrs/F  multiple sclerosis   2° progressive       slow progression    3 yrs       none                                        POSITIVE
ITMS10    36 yrs/M  multiple sclerosis   2° progressive       slow progression    7 yrs       corticosteroids                             negative
OND1      37 yrs/F  cerebellar atrophy   NA²                  NA                 NA           NA                                          negative
OND2      26 yrs/F  viral myelitis       NA                   NA                 NA           NA                                          negative
OND3      38 yrs/F  viral encephalitis   NA                   NA                 NA           NA                                          negative
OND4      28 yrs/F  viral encephalitis   NA                   NA                 NA           NA                                          negative
OND5      64 yrs/M  viral encephalitis   NA                   NA                 NA           NA                                          negative
OND6      32 yrs/M  Guillain-Barré       NA                   NA                 NA           NA                                          negative
OND7      54 yrs/F  cerebrovascular      NA                   NA                 NA           NA                                          negative
OND8      52 yrs/F  hydrocephalus        NA                   NA                 NA           NA                                          negative
OND9      25 yrs/F  1° cerebral tumour   NA                   NA                 NA           NA                                          negative
OND10     21 yrs/M  epilepsy             NA                   NA                 NA           NA                                          negative

[0514] Cerebrospinal fluid (CSF) samples were available from 10 patients with MS and from 10 patients with other neurological disorders. Total RNA was extracted from CSF pellets, reverse transcribed and amplified as above. ELOSA analysis (Table 6) of the PCR products revealed MSRV-cpol sequences in 5 of the 10 MS patient samples but in none of the 10 samples from patients with other neurological diseases (P<0.05). The presence of MSRV-cpol did not appear to be correlated with age, sex or type of MS, but was seen in untreated patients only (5/6). No patient with immunosuppressive therapy was found positive (0/4). No correlation between MSRV-cpol detection and CSF cell count was observed.
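The text does not name the statistical test behind the P<0.05 figure for the CSF series. As an illustration only, a two-sided Fisher exact test on the 2×2 table (5 positive and 5 negative MS samples; 0 positive and 10 negative samples from other neurological diseases) can be computed with the standard library. The function below is a hypothetical helper written for this sketch, not a method taken from the source.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed one.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2

    def prob(x):
        # Probability of x positives in row 1 given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    return sum(prob(x) for x in range(lo, hi + 1)
               if prob(x) <= p_obs * (1 + 1e-9))

# MS: 5 positive / 5 negative; other neurological diseases: 0 / 10.
p_csf = fisher_exact_two_sided(5, 5, 0, 10)
print(f"two-sided Fisher exact p = {p_csf:.4f}")
```

Under this assumed test the p-value falls below 0.05, in line with the significance level quoted in the text.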

[0515] Cloning and Sequencing a Larger Region of the pol Gene

[0516] An independent identification of the MSRV genomic sequence was obtained by a non-PCR approach using RNA extracted from concentrated virions derived from 2.5 liters of LM7-infected sub-cultures of choroid plexus cells. A limited number of clones was obtained by direct cloning of the cDNA, one of which (PSJ17) showed partial homology with ERV9 pol. Specific primers based on the MSRV-cpol region and on the PSJ17 clone amplified a 740 bp fragment linking the two independent sequences in RNA extracted from purified virions. PSJ17 was localized on the 3′ side of MSRV-cpol. Further sequence extension on the 5′ side of MSRV-cpol and on the 3′ side of PSJ17 was obtained using RT-PCR approaches on RNA from purified LM7-like virions produced in MS choroid plexus cultures (4).

[0517] In FIG. 46, the nucleotide sequence corresponding to overlapping clones obtained by sequence extension in the pol gene is represented with the amino acid translation corresponding to the putative open reading frames (ORFs) of the protease and of the reverse-transcriptase. The active site motifs of the protease (PRT) and of the reverse-transcriptase (RT) are underlined. In the C-terminal region of the RT sequence, the dispersed amino acid residues regularly present in retroviral RNase H domains are also underlined.

[0518] Non-Degenerate Primers Detect MSRV-Specific RNA in Virions Associated with the Peak of RT-Activity and in MS Patients' Plasma

[0519] PCR primers (ST1.1 primer set; positions 603-625/1732-1714, on FIG. 46) based on overlapping clones in the pol gene, amplified a 1.15 kb segment of the RT region from several different isolates obtained from different MS patients. Nested primers (ST1.2; positions 869-889/1513-1490, on FIG. 46) generated a 700 bp fragment (FIG. 47) which was more easily visualized by ethidium bromide staining than the first round product generated by ST1.1. The specificity of PCR products was confirmed by stringent hybridization with a peroxidase-labeled MSRV-cpol probe (FIG. 44), using the ELOSA technique (21).
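The primer coordinates quoted above allow the nominal product sizes to be worked out arithmetically. The sketch below assumes that all four coordinate pairs refer to the same sequence numbering, and that each downstream primer is listed 5′ to 3′ on the reverse strand, so its first coordinate is the outer end of the product; the function name is illustrative. The results are nominal lengths and are only expected to be close to the approximate sizes (1.15 kb and 700 bp) given in the text.

```python
def amplicon_length(upstream_start, downstream_start):
    """Nominal product length between the outer ends of a primer pair."""
    return downstream_start - upstream_start + 1

# Coordinates as quoted: ST1.1 603-625 / 1732-1714, ST1.2 869-889 / 1513-1490.
st1_1 = amplicon_length(603, 1732)
st1_2 = amplicon_length(869, 1513)

# A nested (second round) product must lie inside the first round product.
is_nested = 603 <= 869 and 1513 <= 1732

print(f"ST1.1 nominal product: {st1_1} bp")
print(f"ST1.2 nominal product: {st1_2} bp")
print("ST1.2 lies within ST1.1:", is_nested)
```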

[0520] The ST1.1 and ST1.2 primer sets were used to detect extracellular MSRV RNA in human plasma, although they are not optimal for this application. FIG. 47 illustrates the results of PCR amplification of cDNA derived from 2 MS patient and 2 control plasma samples tested in parallel with cDNA from the sucrose density gradient fractions of an MS choroid plexus isolate. Taq-sequencing of the 700 bp bands confirmed the presence of MSRV sequence. A very faint 700 bp band is also visible in fraction 10 which corresponds to the bottom of the tube where aggregated particles usually sediment. Control RT-PCR for cellular aldolase transcripts on plasma-derived RNA was negative, indicating that the results were not due to cellular RNA released by cell lysis during plasma separation. It should be noted that this PCR technique was not designed for epidemiological studies since its sensitivity is impaired by the length of the cDNA required (1.15 kb).

[0521] Non-degenerate primers amplifying three fragments of the pol gene (the whole protease region, regions A and B of the reverse transcriptase; Cf. FIG. 46) were also used to confirm the presence of MSRV sequences in DNase-treated RNA from MS plasma. These fragments were amplified from the plasma of a further 4 MS patients with active disease. Sequence analysis confirmed that the PRT and RT regions were homologous (>95% and >90% respectively) to MSRV sequences previously obtained from cultured virions. No such sequences were detected in plasma from healthy controls (n=4), tested in parallel with MS plasma.

[0522] 3) Discussion

[0523] Phylogeny of MSRV

[0524] From the results of this study, it can be concluded that the virus previously referred to as “LM7” (3, 5, 26) possesses an RNA genome containing the MSRV pol sequences described here. The conserved RT motif of both MSRV and ERV9 is two amino acids shorter than that of other retroviruses, apart from human foamy viruses which nonetheless have a functional RT. The potential ORF encompassing the entire PRT-RT region is consistent with the virion-associated RT-activity detected in sucrose density gradients with infected culture supernatants. Moreover, since we have recently succeeded in expressing a recombinant protein from the sequence of MSRV protease cloned from MS plasma, we can confirm the reality of the potential PRT ORF. Similar cloning and expression of other sequences containing potential ORFs for MSRV proteins is being undertaken to confirm their ability to encode enzymes and structural proteins of MSRV virions. The phylogenic tree in FIG. 45, based on the most conserved amino acid sequence in retroviruses (VLPQG . . . YXDD), shows that the MSRV-pol gene is related to the C-type oncoviruses. Apart from ERV9, the closest known retroviral element is RTLV-H, a human endogenous sequence known to have a subtype with a functional pol gene (32). In the pol region, this phylogenic affiliation to C-type oncoviruses apparently contradicts our previous assumptions based on the general morphology of the particles observed by electron microscopy (EM), which were compatible with a B or D-type oncovirus (3, 5, 26). However, preliminary data on env sequences detected in MSRV virions would suggest a greater phylogenic proximity to D-type. Such differences between the phylogenies of the pol and env genes have been described in MPMV and suggest a recombinatorial origin in D-type retroviruses (33). D to C type morphological conversion is also possible since it has been reported that a single amino acid substitution in the gag protein can convert retrovirus morphology to that of a different type (34).

[0525] Is MSRV an Exogenous Retrovirus Sharing Extensive Homology with a Related Endogenous Retrovirus Family or an Endogenous Retrovirus Producing Extracellular Virions?

[0526] Southern blot analysis with an MSRV pol probe under stringent conditions showed hybridisation with a multicopy endogenous family (data not presented), indicating the existence of endogenous elements more closely related to MSRV than ERV9 itself. Consequently, we were unable to look for a virion-specific provirus in MSRV-producing cells. In agreement with the Southern blot findings, PCR studies on genomic DNA showed multiple band amplification of MSRV-related endogenous sequences. Since pol is the most conserved retroviral gene, the sequence described here is the least suitable region to discriminate between exogenous and endogenous sequences. It is hoped that sequence information from other parts of the genome may permit such a discrimination, even if only over a tiny portion, as has recently been demonstrated for the Jaagsiekte retrovirus (JSRV) of sheep (35). With such sequence data, it would then become possible to identify the MSRV-specific provirus in the genome of virion-producing cell cultures.

[0527] MSRV could represent a virion-producing exogenous member of an ERV9-like endogenous family, just as exogenous strains exist in the well-studied mouse mammary tumour virus (MMTV) and murine leukaemia virus (MuLV) retroviral families of mice, and also in the JSRV retroviral family of sheep (36). Alternatively, it is also conceivable that the extracellular MSRV virions may be produced by a replication-competent endogenous provirus. Whether MSRV is exogenous or endogenous, conceptual similarities exist with the category of retroviruses represented by MuLV, MMTV and JSRV. Unlike defective endogenous elements, agents in this category are known to produce infectious and pathogenic virions, to cause neurological disease (37) and solid tumours/leukaemias (36, 38), and to express “endogenous superantigens” (39, 40). Furthermore, in MuLV infections, the genetic endogenous retroviral background of the mouse strain can determine susceptibility or resistance to disease (39, 41). Indeed, such interactions between an infectious retrovirus and its endogenous counterpart may be relevant in the pathogenesis of MS, since endogenous retroviral genotypes are not identical in all individuals. A genetic control due to related endogenous retroviral genotypes could therefore contribute to the known hereditary susceptibility to MS (43), if MSRV does indeed play an active role in this disease. In addition, the data in Table 5 suggest that ERV9 elements may be co-expressed, possibly via trans-activation in infected cells, and give rise to heterologous RNA packaging in MSRV virions. Such heterologous packaging is known to occur in other retroviral systems (42).

[0528] A Role for the Numerous Common Viruses Previously Evoked in MS?

[0529] Among the numerous reports of viruses putatively involved in the aetiopathogenesis of MS, a significant proportion focus on two viral families, the paramyxoviridae and the herpesviridae. Regarding the paramyxoviridae, the key observation is of a frequently increased antibody titer to measles virus in MS patients, essentially directed, in CSF, against measles fusion protein (44). The existence of amino acid similarities between conserved domains of the fusion proteins of paramyxoviridae and the transmembrane protein of retroviruses (45) may explain this observation if antigenic cross-reactivity between these two proteins occurred.

[0530] With regard to the herpesvirus family, the involvement ofEpstein-Barr Virus (EBV), Herpes Simplex Virus type 1 (HSV-1) and, mostrecently, Human Herpes Virus 6 (HHV-6) has been proposed (31, 46, 47).From our previous studies and from those of other groups, it appearsthat herpesviruses may play an important role in MSRV expression: wehave shown that HSV-1 immediate-early ICP0 and ICP4 proteins cantransactivate MSRV/LM7 in vitro (6) and Haahr et al. have proposed animportant epidemiological role for EBV, as a co-factor in MS, triggeringretrovirus reactivation (31). The recent description by Challoner et al.(47) showing significant expression of HHV6 proteins in MS plaques mayalso suggest a similar role for HHV6 in the brain.

EXAMPLE 19 MSRV Genome Detection Technique

[0531] Following 0.4 mm filtration to remove cellular debris and RNasedigestion to remove residual non-encapsidated RNA, serum was processedto extract viral RNA by means of adsorption to a silica matrix. ViralRNA was subjected to DNase digestion, then a combined reversetranscription-PCR (RT-PCR) reaction was performed using primers PTpol-A(sense: 5′xxxx3′, SEQ ID NO: 142) and PTpol-F (antisense: 5′xxxx3′, SEQID NO: 143). A second round of amplification with nested primers PTpol-B(sense: 5′xxxx3′, SEQ ID NO: 144) and PTpol-E (antisense: 5′xxxx3′, SEQID NO: 145) generated a 435 bp PCR product which was identified by gelelectrophoresis. The specificity of each product was confirmed bydideoxy sequencing. Control reactions without reverse transcriptase wereperformed to ensure that the products were derived from viral RNA. Inaddition, to exclude the possibility that the extracted viral RNA mightbe contaminated with host cell derived nucleic acids, aliquots weretested by nested PCR for the presence of pyruvate dehydrogenase (PDH)DNA and RNA. Samples which generated a signal in either the PDH or the“no-RT” PCR assays were excluded from the analysis.

[0532] Sera from patients with clinically active MS and controls were amplified by RT-PCR and sequenced. Virion associated MSRV-RNA was detected in the serum of 10 of 19 (53%) patients with MS but in only 3 of 44 controls without MS (P=0.0001). The control group consisted of 8 patients (all MSRV-RNA negative) with rheumatological disorders and 36 healthy adults. MSRV-RNA titers in both MS patients and controls were apparently low because even moderate dilution of sera (<10 fold) caused loss of signal.

[0533] In MS patients, detection of MSRV-RNA was not associated with age, sex, disease duration, or MS type; however, a significant negative correlation with treatment was observed. Twenty-six serum samples were obtained from the 19 patients; 100% of the sera from untreated patients contained detectable MSRV-RNA, whereas it was detectable in only 4 of 19 samples (21%) obtained during treatment with corticosteroids and/or azathioprine (P=0.001).
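The two comparisons reported above (10 of 19 MS patients versus 3 of 44 controls, and 7 of 7 untreated sera versus 4 of 19 treated sera) can be re-examined with a short calculation. The text does not state which statistical test produced the reported P values; the sketch below assumes a two-sided Fisher exact test, and the function name is illustrative.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact P value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of x positives in row 1 with fixed margins
        return comb(row1, x) * comb(row2, col1 - x) / denom

    observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one
    return sum(
        p
        for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= observed * (1 + 1e-9)
    )


# MS patients (10 positive, 9 negative) versus controls (3 positive, 41 negative)
p_ms_vs_controls = fisher_exact_two_sided(10, 9, 3, 41)

# Untreated sera (7 positive, 0 negative) versus treated sera (4 positive, 15 negative)
p_untreated_vs_treated = fisher_exact_two_sided(7, 0, 4, 15)
```

Under this assumed test both P values come out below 0.001. They need not match the reported figures exactly, since a different test may have been used.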

[0534] The reason for the apparent loss of virion associated MSRV-RNA during immunosuppressive treatment is unknown but the finding is in agreement with the previous observations on the detection of MSRV in cerebrospinal fluid.

TABLE 7 DETECTION OF VIRION ASSOCIATED MSRV-RNA IN MS UNTREATED PATIENTS & CONTROLS

                                        Positive   Negative   Total   % Positive
Controls without MS^(a)                 3^(b)      41         44      7%
MS sera untreated at time of sampling   7          0          7       100%

[0535] Method:

[0536] Modified SNAP RNA Extraction with Filtration and RNase Digestion

[0537] (All centrifugation steps are at room temperature.)

[0538] Up to 500 microliters of serum is filtered using 0.45 micron spin filters (Nanosep MF from Flowgen Catalogue No. U3-0126 Ref. ODM45). The serum is spun for 5 min at 130,000 g (or for a further 10 min if necessary).

[0539] 150 microliters of filtered serum is incubated with 10 units RNase One (Promega Catalogue No. M4261) for 30 min at 37° C.

[0540] The 150 microliters was then extracted using the SNAP RNA extraction kit (Invitrogen) as below:

[0541] 10 micrograms of poly A RNA was added to the 450 microliters of Binding Buffer to act as a carrier; this was then added to the serum and mixed by inversion 6 times; 300 microliters of propan-2-ol was then added and mixed by inversion 10 times; 500 microliters was transferred to the SNAP column and spun at 1300 g for 1 min and the flow-through discarded; the remainder was then added to the SNAP column and spun at 1300 g for 1 min and the flow-through discarded; the column was then washed with 600 microliters of Super wash and the flow-through discarded; the column was then washed with 600 microliters of 1×RNA wash and the flow-through discarded; this wash was repeated with a 2 min 1300 g spin and the flow-through discarded; the bound nucleic acid was then eluted by incubating with 135 microliters of RNase free water for 5 min and spun at 1300 g for 1 min.

[0542] 15 microliters of 10×DNase buffer and 3 microliters (30 units) of DNase I, RNase free (Boehringer Mannheim Cat. No. 776 785) were added and incubated for 30 min at 37° C.; 450 microliters of Binding Buffer was added and mixed by inversion 6 times; 300 microliters of propan-2-ol was then added and mixed by inversion 10 times; 500 microliters was transferred to the SNAP column and spun at 1300 g for 1 min and the flow-through discarded; the remainder was then added to the SNAP column and spun at 1300 g for 1 min and the flow-through discarded; the column was then washed with 600 microliters of 1×RNA wash and the flow-through discarded; this wash was repeated with a 2 min 1300 g spin and the flow-through discarded; the bound nucleic acid was then eluted by incubating with 105 microliters of RNase free water for 5 min and spun at 1300 g for 1 min.
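The volumes given in paragraphs [0540] to [0542] explain why each binding mixture is loaded onto the SNAP column in two passes. The following Python sketch tallies them. It assumes that the poly A carrier RNA and any pipetting losses contribute negligible volume, which the text does not specify; the function and constant names are illustrative.

```python
from typing import Sequence, Tuple

FIRST_LOAD_UL = 500.0  # volume transferred to the column in the first pass


def plan_column_loads(components_ul: Sequence[float]) -> Tuple[float, float, float]:
    """Return (total mixture, first load, remainder) in microliters."""
    total = float(sum(components_ul))
    first = min(total, FIRST_LOAD_UL)
    return total, first, total - first


# First binding: filtered serum + Binding Buffer + propan-2-ol
FIRST_BINDING = (150.0, 450.0, 300.0)

# Second binding, after DNase digestion:
# eluate + 10x DNase buffer + DNase I + Binding Buffer + propan-2-ol
SECOND_BINDING = (135.0, 15.0, 3.0, 450.0, 300.0)

first_plan = plan_column_loads(FIRST_BINDING)    # 900 total, 400 left for the second pass
second_plan = plan_column_loads(SECOND_BINDING)  # 903 total, 403 left for the second pass
```

On these figures each mixture is roughly 900 microliters, so the "remainder" referred to in the protocol is about 400 microliters in both rounds.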

[0543] Titan RT-PCR

[0544] RT-PCR was performed using the Titan one tube RT-PCR system (Boehringer Mannheim Cat. No. 1 855 476). 25 microliters of RNA was used in the combined RT-PCR reaction. The total reaction volume was 50 microliters. Promega rRNasin (10 units) was the RNase inhibitor used. 170 ng of each of primers SEQ ID NO: 142 and SEQ ID NO: 143 was used. A single master mix was prepared and the sample RNA added last. This was performed at room temperature, not on ice.

[0545] The RT step consisted of two sequential 30 min incubations at 50° C. and then 60° C. This was immediately followed by the PCR, which had the following steps:

[0546] Initial denaturation of template at 94° C. for 2 min,

[0547] 40 cycles of 94° C. for 30 seconds; 60° C. for 30 seconds; 68° C. for 45 seconds,

[0548] 1 cycle of 68° C. for 7 min.
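For illustration, the first-round program of paragraphs [0545] to [0548] can be written as data and its total programmed hold time computed. This is a sketch only: the class and variable names are hypothetical, and ramp times between temperatures, which depend on the thermal cycler, are ignored.

```python
from dataclasses import dataclass
from typing import List, Tuple

Step = Tuple[float, int]  # (temperature in degrees C, hold time in seconds)


@dataclass
class Stage:
    name: str
    steps: List[Step]
    cycles: int = 1

    def hold_seconds(self) -> int:
        """Total hold time of this stage over all of its cycles."""
        return self.cycles * sum(seconds for _, seconds in self.steps)


FIRST_ROUND = [
    Stage("reverse transcription", [(50, 30 * 60), (60, 30 * 60)]),
    Stage("initial denaturation", [(94, 2 * 60)]),
    Stage("amplification", [(94, 30), (60, 30), (68, 45)], cycles=40),
    Stage("final extension", [(68, 7 * 60)]),
]


def total_hold_minutes(program: List[Stage]) -> float:
    """Sum of programmed hold times, in minutes, excluding ramping."""
    return sum(stage.hold_seconds() for stage in program) / 60


first_round_minutes = total_hold_minutes(FIRST_ROUND)  # 139.0
```

The combined RT-PCR therefore occupies 139 minutes of programmed hold time, of which 60 minutes is the reverse transcription step.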

[0549] The second round PCR was performed using the Expand long template PCR system (Boehringer Mannheim Cat. No. 1681 842). 0.5 microliters of the RT-PCR mix was added to 25 microliters of the round 2 PCR mix. Buffer No. 3 and 50 ng of primers B and E were used. The PCR had the following steps:

[0550] 5 cycles of 94° C. for 30 seconds, 60° C. for 30 seconds, 68° C. for 45 seconds,

[0551] 1 cycle of 68° C. for 7 min.

[0552] The PCR products were then run on a 2% agarose gel.

[0553] The “no RT” controls were performed using the “Expand” PCR system for both rounds. The first round was 40 cycles and the second round 20 cycles.

[0554] As a positive control a DNA dilution series was used in both the RT-PCR and the “no RT” PCR. For a result to be valid the RT-PCR and “no-RT” PCRs had to have detected DNA equivalent to between 1 and 0.1 cells.

[0555] The analysis of PCR products of an approximately 435 bp fragment in the pol region is shown in Table 8.

TABLE 8 ANALYSIS OF PCR PRODUCTS WITH ORF*

Exp    Disease   Clone   ORF   Fragment (bp)   AA-RT Motif Site
46-7   MS        1       +     429             YGDD
                 5       +     429             YGDD
                 8       +     429             YGDD
68-1   MS        41      +     438             YMDD
                 42      +     438             YMDD
                 43      +     438             YMDD
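Two simple consistency checks can be applied to the values of Table 8: whether each fragment length corresponds to a whole number of codons, as would be expected for a fragment carrying an uninterrupted open reading frame, and whether the reported motif has the Y-X-D-D form conserved in the active site of reverse transcriptases. The sketch below is illustrative only; the helper names are hypothetical and the checks are not described in the original analysis.

```python
import re

# Rows of Table 8: (experiment, clone, fragment length in bp, reported RT motif)
TABLE_8 = [
    ("46-7", 1, 429, "YGDD"),
    ("46-7", 5, 429, "YGDD"),
    ("46-7", 8, 429, "YGDD"),
    ("68-1", 41, 438, "YMDD"),
    ("68-1", 42, 438, "YMDD"),
    ("68-1", 43, 438, "YMDD"),
]

# Y-X-D-D, where X is any single amino acid in one-letter code
YXDD = re.compile(r"Y[A-Z]DD")


def codons_in_frame(fragment_bp: int) -> int:
    """Number of codons if the length is a multiple of three, else raise."""
    if fragment_bp % 3 != 0:
        raise ValueError(f"{fragment_bp} bp is not a whole number of codons")
    return fragment_bp // 3


def has_yxdd(motif: str) -> bool:
    """True if the motif is exactly of the form Y-X-D-D."""
    return YXDD.fullmatch(motif) is not None


summary = {
    (exp, clone): (codons_in_frame(bp), has_yxdd(motif))
    for exp, clone, bp, motif in TABLE_8
}
```

With the tabulated values, the 429 bp fragments correspond to 143 codons and the 438 bp fragments to 146 codons, and both reported motifs fit the Y-X-D-D pattern.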

[0556] Table 9, the data of which were determined from the alignments of FIGS. 49 to 53, shows variability:

[0557] between the clones obtained from the same patient plasma sample in the same PCR amplification experiment; this means that the patient possesses a virion population which comprises different MSRV variants at a given time,

[0558] between the sequenced variant populations from different patients; this means that the variants differ from one patient to another.

TABLE 9 Degree of identity (percentage) between nucleotide sequences and between peptide sequences, by direct comparison of said sequences (see FIGS. 49-53)

Patient                68-1                                     46-7
Nucleotide sequences   between SEQ ID No: 128 and               between SEQ ID No: 135 and
                       MSRV-pol (SEQ ID No: 1)                  MSRV-pol (SEQ ID No: 1)
                       90.4% ^(b)                               82.5% ^(a)
                       93.3% ^(a)                               84% ^(b)
                       SEQ ID Nos: 129, 130, 131 between them   SEQ ID Nos: 136, 137, 138 between them
                       98.6% ^(b)                               94.5% ^(a)
                       98.7% ^(a)                               95.1% ^(b)
Peptide sequences      between SEQ ID Nos: 132, 133, 134        between SEQ ID Nos: 139, 140, 141
                       and trans of MSRV-1                      and trans of MSRV-1
                       81%                                      73.5%
                       SEQ ID Nos: 132, 133, 134 between them   SEQ ID Nos: 139, 140, 141 between them
                       97%                                      89%

[0559] From FIGS. 53A and 53B, the variability between the sequences of the tested patients can be determined:

[0560] between SEQ ID NO: 128 and SEQ ID NO: 135: 16.5%^(a) and 14.8%^(b)

[0561] between the peptide sequences obtained from SEQ ID NO: 128 and SEQ ID NO: 135: 20%.
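The identity and variability figures above are complementary percentages obtained by direct comparison of aligned sequences. A minimal Python sketch of such a comparison is given below. The treatment of gaps (a position gapped in one sequence counts as a mismatch, a position gapped in both is skipped) is an assumption, since the text does not state how gaps or ambiguous bases were scored; the function names are illustrative.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of identical positions between two pre-aligned sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" and y == "-":
            continue  # column gapped in both sequences carries no information
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


def percent_variability(aligned_a: str, aligned_b: str) -> float:
    """Complement of the percentage identity."""
    return 100.0 - percent_identity(aligned_a, aligned_b)
```

For example, two aligned sequences that differ at one position out of four give 75% identity and 25% variability; a variability of 16.5% corresponds in the same way to an identity of 83.5%.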

BIBLIOGRAPHY

[0562] (1) Norrby E., Prog. Med. Virol., 1978; 24, 1-39.

[0563] (2) Johnson R. T., “Handbook of clinical neurology, 47 Demyelinating diseases”, Vinken P. and Bruyn G. W., eds. Amsterdam, Elsevier Science Publishing, 1985, 319-336.

[0564] (3) Perron H. et al., Res. Virol. 1989, 140, 551-561.

[0565] (4) Perron H. et al., “Current concepts in multiple sclerosis” Wiethölter et al., eds. Amsterdam, Elsevier, 1991, 111-116.

[0566] (5) Perron H. et al., The Lancet 1991, 337, 862-863.

[0567] (6) Perron H. et al., J. Gen. Virol. 1993, 74, 65-72.

[0568] (7) Fields and Knipe, Fundamental Virology 1986, Raven Press N.Y.

[0569] (8) Nielsen P. E. et al., Science 1991; 254, 1497-1500.

[0570] (9) Maniatis et al., Molecular Cloning, Cold Spring Harbor, 1982.

[0571] (10) Southern E. M., J. Mol. Biol. 1975, 98, 503.

[0572] (11) Dunn A. R. and Hassel J. A., Cell 1977, 12, 23.

[0573] (12) Shih et al., J. Virol. 1989, 63, 64-75.

[0574] (13) Perron H. et al., Res. Vir. 1992, 143, 337-350.

[0575] (14) Meyerhans et al., Cell 1989, 58, 901-910.

[0576] (15) Linial M. L. and Miller A. D., “Current topics in microbiology and immunology. Retroviruses, strategies of replication” vol. 157, 125-152; Swanstrom R. and Vogt P. K., editors, Springer-Verlag, Heidelberg 1990.

[0577] (16) Lori F. et al., J. Virol. 1992, 66, 5067-5074.

[0578] (17) Sambrook J., Fritsch E. F. and Maniatis T., Molecular cloning, a laboratory manual. Cold Spring Harbor Laboratory Press, 1989.

[0579] (18) La Mantia et al., Nucleic Acids Research 1991, 19, 1513-1520.

[0580] (19) Gonzales-Quintial R., Baccala R., Pope R. M. and Theofilopoulos A. N., J. Clin. Invest., Vol. 97, Number 5, pp. 1335-1343, 1996.

[0581] (20) Chomczynski P. and N. Sacchi, Analytical Biochemistry 1987, 162, 156-159.

[0582] (21) F. Mallet et al., Journal of Clinical Microbiology 1993; 31, 1444-1449.

[0583] (22) G. Barany and R. B. Merrifield, 1980, In The Peptides, 2, 1-284, Gross E. and Meienhofer J., Eds., Academic Press, New York.

[0584] (23) Poser et al., Ebers G. C. eds. The diagnosis of multiple sclerosis. Thieme Stratton Inc, New York 1984: 225-229.

[0585] (24) La Mantia et al., Nucleic Acids Research 1989, 17, 5913-22.

[0586] (25) Plaza A., Kono D. H. and Theofilopoulos A. N. New human Vb genes and polymorphic variants. J. Imm. 147(12), 4360-4365 (1991).

[0587] (26) H. Perron, In vitro transmission and antigenicity of a retrovirus isolated from multiple sclerosis, Res. Virol. 143, 337-350 (1992).

[0588] (27) J. Garson et al., Development of a “Pan-retrovirus” detection system for multiple sclerosis studies. Acta Neurol. Scand. (in Press).

[0589] (28) F. Mallet, G. Oriol, C. Mary, B. Verrier and B. Mandrand. Continuous RT-PCR and taq DNA polymerase: Characterization and comparison to uncoupled procedures. Biotechniques 18, 678-687 (1995).

[0590] (29) R. Baccala, D. H. Kono, S. Walker, R. S. Balderas and Theofilopoulos. Genomically imposed and somatically modified human thymocyte Vb gene repertoires. Proc. Natl. Acad. Sci. USA (1991) 88, 2908.

[0591] (30) Haahr S., Koch-Henriksen N., Moller-Larsen A., Eriksen L. S. & Andersen H. M. K. Increased risk of multiple sclerosis after late Epstein-Barr virus infection: a historical prospective study. Multiple Sclerosis 1, 73-77 (1995).

[0592] (31) Haahr S. et al. A putative new retrovirus associated with multiple sclerosis and the possible involvement of Epstein-Barr virus in this disease. Ann. NY Acad. Science. 724, 148-156 (1994).

[0593] (32) Wilkinson D. A., Goodchild N. L., Saxton T. M., Wood S. & Mager D. L. Evidence for a functional subclass of the RTVL-H family of human endogenous retrovirus-like sequences. J. Virol. 67, 2981-2989 (1993).

[0594] (33) Sonigo P., Barker C., Hunter E. and Wain-Hobson S. Nucleotide sequence of Mason-Pfizer monkey virus: an immunosuppressive D-type retrovirus. Cell 45, 375-85 (1986).

[0595] (34) Rhee S. S. and Hunter E. A single amino acid substitution within the matrix protein of a D-type retrovirus converts its morphogenesis to that of a C-type retrovirus. Cell 63, 77-86 (1990).

[0596] (35) Bai J., Zhu R. Y., Stedman K., Cousens C., Carlson J., Sharp J. M. and DeMartini J. C. Unique long terminal repeat U3 sequences distinguish exogenous Jaagsiekte sheep retroviruses associated with ovine pulmonary carcinoma from endogenous loci in the sheep genome. J. Virol. 70, 3159-3168 (1996).

[0597] (36) Palmarini M., Cousens C., Dalziel R. G., Bai J., Stedman K., DeMartini J. C. and Sharp J. M. The exogenous form of Jaagsiekte retrovirus is specifically associated with a contagious lung cancer of sheep. J. Virol. 70, 1618-1623 (1996).

[0598] (37) Portis J. L. Wild mouse retrovirus: pathogenesis, in “Retrovirus infections of the nervous system”. Oldstone M. B. A. and Koprowski H. Eds. Current topics in microbiology and immunology, n° 160, p. 11-27. (Springer-Verlag, Berlin 1990).

[0599] (38) Gardner M. B., Chivi A., Dougherty M. F., Casagrande J. & Estes J. D. Congenital transmission of murine leukaemia virus from wild mice prone to development of lymphoma and paralysis. J. Natl. Cancer Inst. 62, 63-69 (1979).

[0600] (39) Marrack P., Kushnir E. & Kappler J. A maternally inherited superantigen encoded by a mammary tumor virus. Nature 349, 524-526 (1991).

[0601] (40) Hügin A. W., Vacchio M. S. & Morse H. C. A virus-encoded superantigen in a retrovirus-induced immunodeficiency syndrome of mice. Science 252, 424-427 (1991).

[0602] (41) Gardner M. B. Genetic resistance to a retroviral neurologic disease in wild mice, in “Retrovirus infections of the nervous system” Oldstone M. B. A. and Koprowski H. Eds. Current topics in microbiology and immunology, n° 160, p. 3-10. (Springer-Verlag, Berlin 1990).

[0603] (42) Linial M. L. & Miller A. D. Retroviral RNA packaging: sequence requirements and implications, in “Retroviruses-strategies of replication” Swanstrom R. & Vogt P. K. Eds. Current topics in microbiology and immunology, n° 157, p. 125-152. (Springer-Verlag, Berlin 1990).

[0604] (43) Bell J. I. and Lathrop G. M. Multiple loci for multiple sclerosis. Nature Genetics 13, 377-78 (1996).

[0605] (44) Dhib-Jalbut S., Lewis K., Bradburn E., McFarlin D. E. and McFarland H. F. Measles virus polypeptide-specific antibody profile in multiple sclerosis. Neurology, 1990; 40: 430-435.

[0606] (45) Gonzalez-Scarano F., Waxham M. N., Ross A. M. and Hoxie J. A. Sequence similarities between human immunodeficiency virus gp41 and Paramyxovirus fusion proteins. AIDS Res. Hum. Retrov. 1987; 3: 245-252.

[0607] (46) Bergström T., Andersen O. & Vahlne A. (1989). Isolation of herpes virus type 1 during first attack of multiple sclerosis. Annals of Neurology 26, 283-285.

[0608] (47) Challoner P. B. et al. Plaque-associated expression of human herpesvirus 6 in multiple sclerosis. Proc. Natl. Acad. Sci. USA 92, 7440-7444 (1995).

[0609] (48) A. Gessain et al., Antibodies to Human T-Lymphotropic Virus type-I in patients with tropical spastic paraparesis. Lancet 2, 407-410 (1985).

[0610] (49) H. Perron, J. A. Garson, F. Bedin et al., Molecular identification of a novel retrovirus repeatedly isolated from patients with multiple sclerosis. Proc. Nat. Acad. Sci. USA 94: 7583-7588 (1997).

1 210 1158 base pairs nucleotide single linear cDNA 1 CCCTTTGCCACTACATCAAT TTTAGGAGTA AGGAAACCCA ACGGACAGTG GAGGTTAGTG 60 CAAGAACTCAGGATTATCAA TGAGGCTGTT GTTCCTCTAT ACCCAGCTGT ACCTAACCCT 120 TATACAGTGCTTTCCCAAAT ACCAGAGGAA GCAGAGTGGT TTACAGTCCT GGACCTTAAG 180 GATGCCTTTTTCTGCATCCC TGTACGTCCT GACTCTCAAT TCTTGTTTGC CTTTGAAGAT 240 CCTTTGAACCCAACGTCTCA ACTCACCTGG ACTGTTTTAC CCCAAGGGTT CAGGGATAGC 300 CCCCATCTATTTGGCCAGGC ATTAGCCCAA GACTTGAGTC AATTCTCATA CCTGGACACT 360 CTTGTCCTTCAGTACATGGA TGATTTACTT TTAGTCGCCC GTTCAGAAAC CTTGTGCCAT 420 CAAGCCACCCAAGAACTCTT AACTTTCCTC ACTACCTGTG GCTACAAGGT TTCCAAACCA 480 AAGGCTCGGCTCTGCTCACA GGAGATTAGA TACTNAGGGC TAAAATTATC CAAAGGCACC 540 AGGGCCCTCAGTGAGGAACG TATCCAGCCT ATACTGGCTT ATCCTCATCC CAAAACCCTA 600 AAGCAACTAAGAGGGTTCCT TGGCATAACA GGTTTCTGCC GAAAACAGAT TCCCAGGTAC 660 ASCCCAATAGCCAGACCATT ATATACACTA ATTANGGAAA CTCAGAAAGC CAATACCTAT 720 TTAGTAAGATGGACACCTAC AGAAGTGGCT TTCCAGGCCC TAAAGAAGGC CCTAACCCAA 780 GCCCCAGTGTTCAGCTTGCC AACAGGGCAA GATTTTTCTT TATATGCCAC AGAAAAAACA 840 GGAATAGCTCTAGGAGTCCT TACGCAGGTC TCAGGGATGA GCTTGCAACC CGTGGTATAC 900 CTGAGTAAGGAAATTGATGT AGTGGCAAAG GGTTGGCCTC ATNGTTTATG GGTAATGGNG 960 GCAGTAGCAGTCTNAGTATC TGAAGCAGTT AAAATAATAC AGGGAAGAGA TCTTNCTGTG 1020 TGGACATCTCATGATGTGAA CGGCATACTC ACTGCTAAAG GAGACTTGTG GTTGTCAGAC 1080 AACCATTTACTTAANTATCA GGCTCTATTA CTTGAAGAGC CAGTGCTGNG ACTGCGCACT 1140 TGTGCAACTCTTAAACCC 1158 297 base pairs nucleotide single linear cDNA 2 CCCTTTGCCACTACATCAAT TTTAGGAGTA AGGAAACCCA ACGGACAGTG GAGGTTAGTG 60 CAAGAACTCAGGATTATCAA TGAGGCTGTT GTTCCTCTAT ACCCAGCTGT ACCTAACCCT 120 TATACAGTGCTTTCCCAAAT ACCAGAGGAA GCAGAGTGGT TTACAGTCCT GGACCTTAAG 180 GATGCCTTTTTCTGCATCCC TGTACGTCCT GACTCTCAAT TCTTGTTTGC CTTTGAAGAT 240 CCTTTGAACCCAACGTCTCA ACTCACCTGG ACTGTTTTAC CCCAAGGGTT CAAGGGA 297 85 base pairsnucleotide single linear cDNA 3 GTTTAGGGAT ANCCCTCATC TCTTTGGTCAGGTACTGGCC CAAGATCTAG GCCACTTCTC 60 AGGTCCAGSN ACTCTGTYCC TTCAG 85 86base pairs nucleotide single linear cDNA 4 GTTCAGGGAT AGCCCCCATCTATTTGGCCA 
GGCACTAGCT CAATACTTGA GCCAGTTCTC 60 ATACCTGGAC AYTCTYGTCCTTCGGT 86 85 base pairs nucleotide single linear cDNA 5 GTTCARRGATAGCCCCCATC TATTTGGCCW RGYATTAGCC CAAGACTTGA GYCAATTCTC 60 ATACCTGGACACTCTTGTCC TTYRG 85 85 base pairs nucleotide single linear cDNA 6GTTCAGGGAT AGCTCCCATC TATTTGGCCT GGCATTAACC CGAGACTTAA GCCAGTTCTY 60ATACGTGGAC ACTCTTGTCC TTTGG 85 111 base pairs nucleotide single linearcDNA 7 GTGTTGCCAC AGGGGTTTAR RGATANCYCY CATCTMTTTG GYCWRGYAYT RRCYCRAKAY60 YTRRGYCAVT TCTYAKRYSY RGSNAYTCTB KYCCTTYRGT ACATGGATGA C 111 645 basepairs nucleotide single linear cDNA 8 TCAGGGATAG CCCCCATCTA TTTGGCCAGGCATTAGCCCA AGACTTGAGT CAATTCTCAT 60 ACCTGGACAC TCTTGTCCTT CAGTACATGGATGATTTACT TTTAGTCGCC CGTTCAGAAA 120 CCTTGTGCCA TCAAGCCACC CAAGAACTCTTAACTTTCCT CACTACCTGT GGCTACAAGG 180 TTTCCAAACC AAAGGCTCGG CTCTGCTCACAGGAGATTAG ATACTNAGGG CTAAAATTAT 240 CCAAAGGCAC CAGGGCCCTC AGTGAGGAACGTATCCAGCC TATACTGGCT TATCCTCATC 300 CCAAAACCCT AAAGCAACTA AGAGGGTTCCTTGGCATAAC AGGTTTCTGC CGAAAACAGA 360 TTCCCAGGTA CASCCCAATA GCCAGACCATTATATACACT AATTANGGAA ACTCAGAAAG 420 CCAATACCTA TTTAGTAAGA TGGACACCTACAGAAGTGGC TTTCCAGGCC CTAAAGAAGG 480 CCCTAACCCA AGCCCCAGTG TTCAGCTTGCCAACAGGGCA AGATTTTTCT TTATATGCCA 540 CAGAAAAAAC AGGAATAGCT CTAGGAGTCCTTACGCAGGT CTCAGGGATG AGCTTGCAAC 600 CCGTGGTATA CCTGAGTAAG GAAATTGATGTAGTGGCAAA GGGTT 645 741 base pairs nucleotide single linear cDNA 9CAAGCCACCC AAGAACTCTT AAATTTCCTC ACTACCTGTG GCTACAAGGT TTCCAAACCA 60AAGGCTCAGC TCTGCTCACA GGAGATTAGA TACTTAGGGT TAAAATTATC CAAAGGCACC 120AGGGGCCTCA GTGAGGAACG TATCCAGCCT ATACTGGGTT ATCCTCATCC CAAAACCCTA 180AAGCAACTAA GAGGGTTCCT TAGCATGATC AGGTTTCTGC CGAAAACAAG ATTCCCAGGT 240ACAACCAAAA TAGCCAGACC ATTATATACA CTAATTAAGG AAACTCAGAA AGCCAATACC 300TATTTAGTAA GATGGACACC TAAACAGAAG GCTTTCCAGG CCCTAAAGAA GGCCCTAACC 360CAAGCCCCAG TGTTCAGCTT GCCAACAGGG CAAGATTTTT CTTTATATGG CACAGAAAAA 420ACAGGAATCG CTCTAGGAGT CCTTACACAG GTCCGAGGGA TGAGCTTGCA ACCCGTGGCA 480TACCTGAATA AGGAAATTGA TGTAGTGGCA AAGGGTTGGC CTCATNGTTT ATGGGTAATG 540GNGGCAGTAG 
CAGTCTNAGT ATCTGAAGCA GTTAAAATAA TACAGGGAAG AGATCTTNCT 600GTGTGGACAT CTCATGATGT GAACGGCATA CTCACTGCTA AAGGAGACTT GTGGTTGTCA 660GACAACCATT TACTTAANTA TCAGGCTCTA TTACTTGAAG AGCCAGTGCT GNGACTGCGC 720ACTTGTGCAA CTCTTAAACC C 741 93 base pairs nucleotide single linear cDNA10 TGGAAAGTGT TGCCACAGGG CGCTGAAGCC TATCGCGTGC AGTTGCCGGA TGCCGCCTAT 60AGCCTCTACA TGGATGACAT CCTGCTGGCC TCC 93 96 base pairs nucleotide singlelinear cDNA 11 TTGGATCCAG TGYTGCCACA GGGCGCTGAA GCCTATCGCG TGCAGTTGCCGGATGCCGCC 60 TATAGCCTCT ACGTGGATGA CCTSCTGAAG CTTGAG 96 748 base pairsnucleotide single linear cDNA 12 TGCAAGCTTC ACCGCTTGCT GGATGTAGGCCTCAGTACCG GNGTGCCCCG CGCGCTGTAG 60 TTCGATGTAG AAAGCGCCCG GAAACACGCGGGACCAATGC GTCGCCAGCT TGCGCGCCAG 120 CGCCTCGTTG CCATTGGCCA GCGCCACGCCGATATCACCC GCCATGGCGC CGGAGAGCGC 180 CAGCAGACCG GCGGCCAGCG GCGCATTCTCAACGCCGGGC TCGTCGAACC ATTCGGGGGC 240 GATTTCCGCA CGACCGCGAT GCTGGTTGGAGAGCCAGGCC CTGGCCAGCA ACTGGCACAG 300 GTTCAGGTAA CCCTGCTTGT CCCGCACCAACAGCAGCAGG CGGGTCGGCT TGTCGCGCTC 360 GTCGTGATTG GTGATCCACA CGTCAGCCCCGACGATGGGC TTCACGCCCT TGCCACGCGC 420 TTCCTTGTAG ANGCGCACCA GCCCGAAGGCATTGGCGAGA TCGGTCAGCG CCAAGGCGCC 480 CATGCCATCT TTGGCGGCAG CCTTGACGGCATCGTCGAGA CGGACATTGC CATCGACGAC 540 GGAATATTCG GAGTGGAGAC GGAGGTGGACGAAGCGCGGC GAATTCATCC GCGTATTGTA 600 ACGGGTGACA CCTTCCGCAA AGCATTCCGGACGTGCCCGA TTGACCCGGA GCAACCCCGC 660 ACGGCTGCGC GGGCAGTTAT AATTTCGGCTTACGAATCAA CGGGTTACCC CAGGGCGCTG 720 AAGCCTATCG CGTGCAGTTG CCGGATGC 74818 base pairs nucleotide single linear cDNA 13 GCATCCGGCA ACTGCACG 18 20base pairs nucleotide single linear cDNA 14 GTAGTTCGAT GTAGAAAGCG 20 23base pairs nucleotide single linear cDNA 15 AGGAGTAAGG AAACCCAACG GAC 2319 base pairs nucleotide single linear cDNA 16 TAAGAGTTGC ACAAGTGCG 1921 base pairs nucleotide single linear cDNA 17 TCAGGGATAG CCCCCATCTA T21 24 base pairs nucleotide single linear cDNA 18 AACCCTTTGC CACTACATCAATTT 24 15 base pairs nucleotide single linear cDNA 5, 7, 10, 13 Nrepresents inosine (i) 19 GGTCNTNCCN CANGG 15 21 base pairs 
nucleotidesingle linear cDNA 20 TTAGGGATAG CCCTCATCTC T 21 23 base pairsnucleotide single linear cDNA 21 GCGTAAGGAC TCCTAGAGCT ATT 23 18 basepairs nucleotide single linear cDNA 22 TCATCCATGT ACCGAAGG 18 20 basepairs nucleotide single linear cDNA 23 ATGGGGTTCC CAAGTTCCCT 20 20 basepairs nucleotide single linear cDNA 24 GCCGATATCA CCCGCCATGG 20 20 basepairs nucleotide single linear cDNA 25 CGCGATGCTG GTTGGAGAGC 20 20 basepairs nucleotide single linear cDNA 26 TCTCCACTCC GAATATTCCG 20 26 basepairs nucleotide single linear cDNA 27 GATCTAGGCC ACTTCTCAGG TCCAGS 2623 base pairs nucleotide single linear cDNA 6, 12, 19 N representsinosine (i) 28 CATCTNTTTG GNCAGGCANT AGC 23 24 base pairs nucleotidesingle linear cDNA 29 CTTGAGCCAG TTCTCATACC TGGA 24 22 base pairsnucleotide single linear cDNA 30 AGTGYTRCCM CARGGCGCTG AA 22 22 basepairs nucleotide single linear cDNA 31 GMGGCCAGCA GSAKGTCATC CA 22 22base pairs nucleotide single linear cDNA 32 GGATGCCGCC TATAGCCTCT AC 2222 base pairs nucleotide single linear cDNA 33 AAGCCTATCG CGTGCAGTTG CC22 40 base pairs nucleotide single linear cDNA 34 TAAAGATCTA GAATTCGGCTATAGGCGGCA TCCGGCAAGT 40 50 amino acids amino acid Not Relevant linearpeptide 35 Asp Ala Phe Phe Cys Ile Pro Val Arg Pro Asp Ser Gln Phe LeuPhe 1 5 10 15 Ala Phe Glu Asp Pro Leu Asn Pro Thr Ser Gln Leu Thr TrpThr Val 20 25 30 Leu Pro Gln Gly Phe Arg Asp Ser Pro His Leu Phe Gly GlnAla Leu 35 40 45 Ala Gln 50 150 base pairs nucleic acid single linearcDNA 36 GATGCCTTTT TCTGCATCCC TGTACGTCCT GACTCTCAAT TCTTGTTTGCCTTTGAAGAT 60 CCTTTGAACC CAACGTCTCA ACTCACCTGG ACTGTTTTAC CCCAAGGGTTCAGGGATAGC 120 CCCCATCTAT TTGGCCAGGC ATTAGCCCAA 150 11 amino acids aminoacid Not Relevant linear peptide 37 Cys Ile Pro Val Arg Pro Asp Ser GlnPhe Leu 1 5 10 17 amino acids amino acid Not Relevant linear peptide 38Val Leu Pro Gln Gly Phe Arg Asp Ser Pro His Leu Phe Gly Glu Ala 1 5 1015 Leu 8 amino acid amino acid Not Relevant linear peptide 39 Leu PheAla Phe Glu Asp Pro Leu 1 5 8 amino acids amino acid 
Not Relevant linearpeptide 40 Phe Ala Phe Glu Asp Pro Leu Asn 1 5 25 base pairs nucleicacid single linear cDNA 41 GTGCTGATTG GTGTATTTAC AATCC 25 1859 basepairs nucleic acid single linear cDNA 42 GTGCTGATTG GTGTATTTACAATCCTTTAT CTAATCCGAA ATGCCCATGT TGCAATATGG 60 AAAGAAAGGG AGTTCCTAACCTCTGGGGGA ACCCCCATTA AATACCACAA GTAAATCATG 120 GAGTTATTGC ACACAGTGCAAAAACTCAAG GAGGTGGAAG TCTTACACTG CCAAAGCCAT 180 CAGAAAAGGG AAGAGGGGAGAAGAGCAGCA TAAGTGGCTA CAGAGGCAAG GAAAGACTAG 240 CAGAAAGGAA AGAGAGAAAGAGACAGAAAG TCAGAGAGAG AGAGAGGAAG AGACAGAGCA 300 CAAAGAGGGA GTCAGAGAGAGAGAGAGACA GAGAGTCAGA GAGAAGGAAA GAGAGAGAGG 360 AAGAGACAAA GAATGAATCAAACAGAGAGA CAGAAAGTCA GAGAGAGAGA GAGAGAGGAA 420 GAGACAGAGA AAAAGAGGGAGTCAGAAAAA GAGAGACCAA AGAAGAAGTC CAAAGAGAAA 480 GAAAGAGAGA TGGAAGTAGTAAAGGAAAAA CAGTGTACCC TATTCCTTTA AAAGCCGGGG 540 TAAATTTAAA ACCTATAATTGATAACTGAA GGTCTTCTCT GTAACCCTGT AACACTCCAA 600 TACCACCTTG TTGTCAAGTGTAAACAAGGG CGTAGCCCAA AAGCACTGAG GCCACTAACA 660 ACCCATAGCC TTCCTATCAAAATTCCTTAA CCCAGCAGGT TTCCTAACAG GGGATCTAAA 720 TCTTAATTAA TTACCATACAATGGTCCAAC CAGACTTAGG AGGAATTCCC TTCAGGACGG 780 GAAGATAGAT GCTTCCTCCCAGGCGATTAA GGGAGAAAGA CACAATGGGT ATTCAGTAAG 840 TGCCAAGGGG AACACTTGTAGAAGCAAAGT TAGGAAAATT GCCAAATAAT TGGTTTGCTC 900 AAGAGTTGTT TGCACTCAGCCAAACCTTGA AGTACTTGCA GAATCAGAAA GGAGCCATCT 960 ATACCAATTC TAAGTTAATATGGACTGAAG GAGGTTTTAT TAATACCAAA GAGAAATTAA 1020 AATCCCAAAC TTATAAGGTTTTCAACCAAA GTAAAGTTTG CTAAAAGTTA ACAGCGTAAC 1080 ATGTATTATC CTACTACCACACACTCTCAA AGGATTTCTC AGACAGTTTG CAAGAAATAA 1140 TGATATCTAT CCTTACTCTACAATCCCAAA TAGACTCTTT GGCAGCAGTG ACTCTCCAAA 1200 ACCGTCAAGG CCTAGACCTCCTCACTGCTG AGAAAGGAGG ACTCTGCACC TTCTTAAGGG 1260 AAGAGTGTTG TCTTTACACTAACCAGTCAG GGATAGTATG AGATGCTGCC CGGCATTTAC 1320 AGAAAAAGGC TTCTGAAATCAGACAACGCC TTTCAAATTC CTATACCAAC CTCTGGAGTT 1380 GGGCAACATG GTTTCTTCCCTTTCTATGTC CCATGGCTGC CATCTTGCTA TTACTCGCCT 1440 TTGGGCCCTG TATTTTTAACCTCCTTGTCA AATTTGTTTC TTCTAGGATC GAGGCCATCA 1500 AGCTACAGAT GGTCTTACAAATGGAACCCC AAATGAGCTC AACTATCAAC TTCTACTGAG 1560 
GACCCCTAGA CCAACCCCCTGGCCCTTTCA CTGGCCTAAA GAGTTCCCCT CTGGAGGACA 1620 CTACCACTGC AGGGCCCCATCTTTGCCCCT ATCCAGAAGG AAGTAGCTAG AGCAGTCATT 1680 GCCCAATTCC CAAGAGCAGCTGGGGTGTCC CGTTTAGAGT GGGGATTGAG AGGTGAAGCC 1740 AGCTGGACTT CTGGGTCGGGTGGGGACTTG GAGAACTTTT GTGTCTAGCT AAAGGATTGT 1800 AAATGCAACA ATCAGTGCTCTGTGTCTAGC TAAAGGATTG TAAATACACC AATCAGCAC 1859 23 base pairs nucleicacid single linear cDNA 43 TGATGTGAAC GGCATACTCA CTG 23 24 base pairsnucleic acid single linear cDNA 44 CCCAGAGGTT AGGAACTCCC TTTC 24 25 basepairs nucleic acid single linear cDNA 45 GCTAAAGGAG ACTTGTGGTT GTCAG 2522 base pairs nucleotide single linear cDNA 46 CAACATGGGC ATTTCGGATT AG22 400 base pairs nucleotide single linear cDNA 47 GGCTGCTAAA GGAGACTTGTGGTTGTCAGA CAATCGCCTA CTTAGGTACC AGGCCTTATT 60 ACTTGAGGGA CTGGTGCTTCAGATGCGCAC TTGTGCAGCT CTTAACCCAA ACTTATGCTG 120 CCCAGAAGGA TCTTTTAGAGGTCCCCTTAG CCAACCCTGA CCTCAACCTA TATATATACT 180 GATGGAAGTT CGTTTGTAGAAAAGGGATTA CAAAGGGNAG GATATNCCAT AGGTTAGTGA 240 TAAAGCAGTA CTTGAAAGTAAGCCTCTTCC CCCCAGGGAC CAGCGCCCCC GTTAGCAGAA 300 CTAGTGGCAC TGACCCCGAGCCTTAGAACT TGGAAAGGGA GGAGGATAAA TGTGTATACA 360 GATAGCAAGT ATGCTTATCTAATCCGAAAT GCCCATGTTG 400 2389 base pairs nucleotide single linear cDNA48 TCAGGGATAG CCCCCATCTA TTTGGTCAGG CACTGGCCCA AGATCTAGGG ACATGCCACT 60TTTAAGAGCC ATTTCTCAAG TCCAGGTACT CTGGTCCTTC GGTATGTGGA TGATTTACTT 120TTGGCTACCA GTTCAGTAGC CTCATGCCAG CAGGCTACTC TAGATCTCTT GAACTTTCTA 180GCTAATCAAG GGTACAAGGC ATCTAGGTTG AAGGCCCAGC TTTGCCTACA GCAGGTCAAA 240TATCTAGGCC TAATCTTAGC CAGAGGGACC AGGGCACTCA GCAAGGAACA AATACAGCCT 300ATACTGGCTT ATCCTCACCC TAAGACATTA AAACAGTTGC GGGGGTTCCT TGGAATCACT 360GGCTTTTTGG TGACTATGGA TTCCCAGATA CAGCAAGATT GGCAGGCCCC TCTATACTGT 420AATCAAGGAG ACTCACGAGG GCAAGTACTC ATCTAGTAGA ATGGGAACTA GGGACAGAAA 480CAGCCTTCAA AACCTTAAAG CAGGCCCTAG TACAATCTCC AGCTTTAAGC CTTCCCACAG 540GACAAAACTT CTCTTTATAC ATCACAGAGA GGGCAGAGAT AGCTCTTGGT GTCCTTATTC 600AGACTCATGG GACTACCCCA CAACCAGTGG CACACCTAAG TAAGGAAATT GATGTAGTAG 660CAAAAGGCTG GCCTCACTGT 
TTATGGGTAG CTGTGGTGGT GGCTGTCTTA GTGTCAGAAG 720CTATCAAAAT AATACAAGGA AAGGATCTCA CTGTCTGGAC TACTCATGAT GTAATGGCAT 780ACTAGGTGCC AAAAGAAGTT TATGGGTATC AGACAACCAC CTGCTTAGAT ACCAGGGACT 840ACTCCTGGAG GATTGGGCTT CAAGTGCGTT TTTTGTGGCC TCAACCCTGC CACTTTTCCT 900CCAGAGGATG GAGAGCCGCT TGAGCATGCT TGCCAACAGG TTGTAGGCCA GAATTATTCC 960ACCCGAGATG ATCTCTTAGA GTACCCTTAG CTAATCCTGA CCTTAACCTA TATACCAATG 1020GAAGTTCATT TGTGGAAAAC GGGATATGAA GGGCAGGTTA TGTCATAGTT AGTGATGTAA 1080TCATACTTGC AAGTAAGCCT CTTACCCCAG GGGCCAGCAC TCAGTTAGCA GAACTAGTCA 1140CACTTACCTT AACCTTAGAA CTGGGAAAGG GAAAAAGAAT AAATATGTAT ACAGATAGTA 1200AGTATGCTTA TCTAATCCTA CATGCCCATG CTGCAATATG GAAGGAAAGG GAGTTCCTAA 1260CCCCTGGGGG AACCCCCATT AAATACCACA AGGYAAATCA TGGAGTTATT GCACGCAGTG 1320CAAAAACTCA AGGAGGTGGC AGTCTTACAC TGCCGAAGCY ATCAAAAAGG GGAAGGAGAG 1380GGGAGAACAG CAGCATAAGT GGTTGGCAGA GGCAGTGAAA GACCAGCAGA GAGAAGGAGA 1440GAGACAACGT CAACGACAGA AGGAAAGAAG AGGAGGAGAC AGAGAGGAAG AGACAGAGAG 1500ACAGTTAGTC CAAGAGAGAG ACAGAGAGAG GAAGAGACAG ACAGAAAGTC CAAGAGAGAA 1560GGAAAGAGAG GAAGAGACCA AGGAGTCCNA GAGAGAGAAA GAGATAGAAG TAGTAAAGAA 1620AAAACATTGT ACCCTATTCC TTTAAAAGCC GGGGTATATT TAAAACCTAT AATTGATAAT 1680TGAGTTCTTG CACCCTCCTC CAGGGGATYG CTGGGAGGAA ACCCTCAACC GATATGTGAA 1740AATTGTGGGT CGTCCCTATG TCTCAATTAC CAGCCAATAC CCCCTTGTTT TTAGTGTGAA 1800CGAGGGTGTA GAGCGCAGAC AGGGAGACCT CTGACAATCC ATACCCTTCC TATCCAAAAT 1860CCTTAACCCA GCAGGTTTTC TAAAAGGGGA TCTAAATCTT AATTAATTAC CATACAAAGG 1920TCAAACCAGA TCTAGGAGGA ACTTCCTTCA GGACAGGATG ATAGATGGTT CCTCCCAGGC 1980GATTAAAGAA AATAAAAAGA CACATGGGCA GCCAGTAAGT GATAAGGGAA CACTAGTAGA 2040AGCAGTTAGG AGAAGTTGCC TAATAATTGG TCTACTCCAA ATGTGTGAGT TGTTCGCACT 2100CAGCCCAAAT CTTAAAGTAC TTACAGAATT AGGGAGGAGC CATTTACACC AATTCTAAGT 2160TAATATGGAC TGGATGAGGT TTTATTAATA GCGAAGGAGA ATTAAATCCT AAACTNACAA 2220GGTTTTCAAC TAAAGTAAAT TTTACTAAAA GCTAACAGTG TAACATGCAT TATCCTACTA 2280CAACACACTC TCANAGGATT CCTCAGACAG TTTACAAGAA ATAACAAAAT CTATCTGGTA 2340AGGATAGTAA CTACAATCCC AAATACATTC TTTGGCAGCA GTGACTCTC 2389 2448 
basepairs nucleotide single linear cDNA 49 TCAGGGATAG CCCCCATCTA TTTGATCAGGCACTAGCCCA AGATCTAGGC CACTTCTGAA 60 GTCCAGGCAT TCTAGTCCTT CAGTATGTGGATGATTTACT TTTGGCTACC AGTTTGGAAG 120 CCTCATGCCA GCAGGCTACT TGAGATCTCTTGAACTTTCT AGCTAATCAA GGGTGTATGG 180 CATCTAAATT GAAAGTCCAG CTCTGCCTACAACAAGTCAA ATATCTAGGC CTAATCTTAG 240 ATAGAAGAAC CAGGGCCCTC AGCAAGGAATGAATAAAGCC TATGCTGGCT TATCGGCACC 300 CTAAGACATT AAAACAATTG TGGGGGTTCCTTGGAATCAC TGGCTTTTGC CGACTATGGA 360 TCCCTGGATA GAGTGAGATA GCCAGGCCCCCTCTATTACT CTTATCAAGG AGACCCAGAG 420 GGCAAATACT TATCTAGTAT TATGGGNACCAGAGGCAGAA AAAGCCTTCC AAACCTTAAA 480 GGAGACCCTA GTACAAGCTC CAGCTTTAAGCCTTCCCACA GGACAAANCT TCTCTTTATA 540 TGTCACAGAG AGAGCAGGAA TAGCTCCTGGAGTCCTTACT CAGACTTTTG GACGACCCCA 600 CGGCCAGTGG CRTACCTAAG TAAGGAAATTGATGTAGTAG CAAAAGGCTG GCCTCACTGT 660 TTATGGGTAG TTGCGGCTGT GGCAGTCTTACTGTCAAAGG CTATCAAAAT AATACAAGGA 720 AAGGATTTCA CTATCTGGAC TACTCATGAGGAAAATGGCA TATTAGGTGC CAAAGGAAGT 780 TTTTGGCTAT CAGACAACCA CCTGCTCAGATTCCAGGCAC TACTGATTGA GAGACCAGTG 840 CTTTAAATAT GTATGTGTGT GTGTGGCCCTCAACCCTGCC ACTGTTCTCC CAGAAGATGG 900 AGAACCAATG AAGCATTACT GTCAACAAATTAGAGTCCAG AGTTATGCTG CCTGAGAGGA 960 TCTCTTAGAA GTCCCCTTAG CTAATCCTGACCTTAACCTA TATGCTGATG GAAGTTCACT 1020 TGTGGAGAAT GGGATACGAA AAGCACATTATGCCATAGTT AGTGAGGTAA CAGTACTTGA 1080 AAGTAAGCCT ATTCCCCCAT GGACCAGAGCCCAGTTAGCA GAACTAGTGG CACTTACCCA 1140 AGCCTTAGAA CTAGGAAAGG GAAAAATAATAAATGTGTAT ACAGATAGCA AGTATGCTTA 1200 TCTAATCCTA CATGCCCATG CTGCAGTATGGAAAGAAAGG GAGTTCCTAA CCTCTGGGGG 1260 AACCCCCATT AAATACCACA AGGCAAATCATGGAGTTATT GCATGTAGTG CAAAACCTCA 1320 AGTAGGTGGC AGTTTTACAC TGCCTGAAGCTATGGGGAAG GAGAGAGGAG AACAGCAGCA 1380 TAAGTGGCTA GCAGAGGCAG CGAAAGACTAGCAGAGAGGA GAGGTAGGGG AAAGACAGAA 1440 AGTCAAAGAA AAGAAGTCAA AGACAGACAGAGAAAGAGAC AGAGGGAGCC AGAGAGAAAG 1500 AAAAGAGAGA ACGAAAGAGA CAGAATGTCAAAGAACAGAA GAGAGAGGCA GCGCCAGAAG 1560 AGTTAAGAAA GTGAGAAAGA GAGATGGAAATAGTAAAGAA AAAACAGTGT ACCCTATTCC 1620 TTTAAAAGCC AGGGTAAATT TAAAACGTATAATTTTATAA TTGGAAGGTC TTCTCCATAA 1680 CCCTATAACA 
TTAAAATACC ACCTTGTTGTCAGTGTAAAC AAGAGCATAG CCCAAAAGCA 1740 CTGAGGCCAC TGACAACCCA TAGCCTTCCTATCAAAAATC CTTAACTCTG CAGGTTTCCT 1800 AACAGGGGAT CTAAATCTCA ACTAATCACCATACAATGGT CCGACCAGAC CTAGGAGCGA 1860 CTCCCCTCAG GACAGAAGGA TGGATGGTTCCTCCCAGGCC ATTAAGGGAA AGAGACACAA 1920 TGGGTATTCA GTAAGTGATA AGGGAACTCTTGTAGAAGCA GTTAGGAAGA TTGCCTAATA 1980 TTTGGTCTGC TCAAATGTGC CAGCTGTTTGCACTCAGCTA AACCTTAAAT TACTTACAGA 2040 ATTAGGAAGG AGCCATCTAT ACCAATTCTGAGTTAATATG AGCTGAACAA GTTCTTATTA 2100 ATAGCAAAGA ATCATTGAAA TCTCAAACTTGCAAAGTTTT CAACAAAAGT AAAGTTTGCT 2160 GAAAGTTAGC AGTGTAACAT GTATTATCCTAACTTCTAAT CTTGTGGAAA TCAGACCCTA 2220 TCAGTGCCCC TCAAAGCTGA AGTCCATCAGCATATGGCCA TACAACTAAT ACCCCTATTT 2280 ATAGGGTTAG GAATGGCCAC TGCTACAGGAATGGGAGTAA CAGGTTTATC TACTTCATTA 2340 TCCTATTACC ACACACTCTT AAAGGATTTCTCAGACAGTT TACAAGAAAT AACAAAATCT 2400 ATCCTTACTC TNTARTCCCA AATAGRTTCTTTGGCAGCAG TGACTCTC 2448 21 base pairs nucleotide single linear cDNA 50CCTGAGTTCT TGCACTAACC C 21 23 base pairs nucleotide single linear cDNA51 GTCCGTTGGG TTTCCTTACT CCT 23 1196 base pairs nucleotide single linearcDNA 52 TTCCTGAGTT CTTGCACTAA CCTCAAATGA GAGAAGTGCC GCCATAACTGCAACCCAAGA 60 GTTTGGCGAT CCCTGGTATC TCAGTCAGGT CAATGACAGG ATGACAACAGAGGAAAGATA 120 ATGATTCCCC ACAGGCCAGC AGGCAGTTCC CAGTGTAGAC CCTCATTAGGACACAGAATC 180 AGAACATGGA GATTGGTGCC GCAGACATTT GCTAACTTGC GTGCTAGAAGGACTAAGGAA 240 AACTAGGAAG ATATGAATTA TTCAATGATG TCCACTATAA CACAGGGGAAAGGAAGAAAA 300 TCCTACTGCC TTTCTGGAGA GACTAAGGGA GGCATTGAGG AAGCATACCAGGCAAGTGGA 360 CATTGGAGGC TCTGGAAAAG GGAAAAGTTG GGAAAAGTAT ATGTCTAATAGGGCTTGCTT 420 CCAGTGTGGT CTACAAGGAC ACTTTAAAAA AGATTGTCCA ATAGAAATAAGCCACCACCT 480 CGTCCATGCC CCTTATGTCA AGGGAATCAC TGGAAGGCCC ACTGCCCCAGGGGATGAAGG 540 TCCTCTGAGT CAGAAGCCAC TAACCAGATG ATCCAGCAGC AGGACTGAGGGTGCCCGGGG 600 CAAGCGCCAG CCCATGCCAT CACCCTCACA GAGCCCCAGG TATGCTTGACCATTGAGGGT 660 CAGAAGGGTA CTGTCTCCTG GACACTGGCG GGCCTTCTCA GTCTTACTTTCCTGTCCTGG 720 ACAACTGTCC TCCAGATCTG TCACTGTCCG AGGGGTCCTA GGACAGCCAGTCACTAGATA 780 CTTCTCCCAG CCACTAAGTT 
GTGACTGGGG AACTTTACTC TTCCACATGCTTTTCTAATT 840 ATGCCTGAAA GCCCCACTCT CTTGTTAGGG GAGAGACATT CTAGCAAAAGCAGGGGCCAT 900 TATACATGTG AATATAGGAG AAGGAACAAC TGTTTGTTGT CCCCTGCTTGAGGAAGGAAT 960 TAATCCTGAA GTCCGGGCAA CAGAAGGACA ATATGGACAA GCAAAGAATGCCCGTCCTGT 1020 TCAAGTTAAA CTAAAGGATT CCACCTCCTT TCCCTACCAA AGGCAGTACCCCCTCAGACC 1080 CGAGACCCAA CAAGAACTCC AAAAGATTGT AAAGGACCTA AAAGCCCAAGGCCTAGTAAA 1140 ACCAAGCAAT AGCCCTTGCA AGACTCCAAT TTTAGGAGTA AGGAAACCCAACGGAC 1196 2391 base pairs nucleotide single linear cDNA 53 ATGATCCAGCAGCAGGACNG AGGGTGCCCG GGGCAAGCGC CAGCCCATGC CATCACCCTC 60 ACAGAGCCCCAGGTATGCTT GACCATTGAG GGTCAGAAGG GTNACTGTCT CCTGGACACT 120 GGCGGNGCCTTCTCAGTCTT ACTTTCCTGT CCTGGACAAC TGTCCTCCAG ATCTGTCACT 180 GTCCGAGGGGTCCTAGGACA GCCAGTCACT AGATACTTCT CCCAGCCACT AAGTTGTGAC 240 TGGGGAACTTTACTCTTCCC ACATGCTTTT CTAATTATGC CTGAAAGCCC CACTCTCTTG 300 TTGGGGAGAGACATTCTAGC AAAAGCAGGG GCCATTATAC ATGTGAATAT AGGAGAAGGA 360 ACAACTGTTTGTTGTCCCCT GCTTGAGGAA GGAATTAATC CTGAAGTCCG GGCAACAGAA 420 GGACAATATGGACAAGCAAA GAATGCCCGT CCTGTTCAAG TTAAACTAAA GGATTCCACC 480 TCCTTTCCCTACCAAAGGCA GTACCCCCTC AGACCCGAGA CCCAACAAGA ACTCCAAAAG 540 ATTGTAAAGGACCTAAAAGC CCAAGGCCTA GTAAAACCAA GCAATAGCCC TTGCAAGACT 600 CCAATTTTAGGAGTAAGGAA ACCCAACGGA CAGTGGAGGT TAGTGCAAGA ACTCAGGATT 660 ATCAATGAGGCTGTTGTTCC TCTATACCCA GCTGTACCTA ACCCTTATAC AGTGCTTTCC 720 CAAATACCAGAGGAAGCAGA GTGGTTTACA GTCCTGGACC TTAAGGATGC CTTTTTCTGC 780 ATCCCTGTACGTCCTGACTC TCAATTCTTG TTTGCCTTTG AAGATCCTTT GAACCCAACG 840 TCTCAACTCACCTGGACTGT TTTACCCCAA GGGTTCAGGG ATAGCCCCCA TCTATTTGGC 900 CAGGCATTAGCCCAAGACTT GAGTCAATTC TCATACCTGG ACACTCTTGT CCTTCAGTAC 960 ATGGATGATTTACTTTTAGT CGCCCGTTCA GAAACCTTGT GCCATCAAGC CACCCAAGAA 1020 CTCTTAACTTTCCTCACTAC CTGTGGCTAC AAGGTTTCCA AACCAAAGGC TCGGCTCTGC 1080 TCACAGGAGATTAGATACTN AGGGCTAAAA TTATCCAAAG GCACCAGGGC CCTCAGTGAG 1140 GAACGTATCCAGCCTATACT GGCTTATCCT CATCCCAAAA CCCTAAAGCA ACTAAGAGGG 1200 TTCCTTGGCATAACAGGTTT CTGCCGAAAA CAGATTCCCA GGTACASCCC AATAGCCAGA 1260 CCATTATATACACTAATTAN GGAAACTCAG 
AAAGCCAATA CCTATTTAGT AAGATGGACA 1320 CCTACAGAAGTGGCTTTCCA GGCCCTAAAG AAGGCCCTAA CCCAAGCCCC AGTGTTCAGC 1380 TTGCCAACAGGGCAAGATTT TTCTTTATAT GCCACAGAAA AAACAGGAAT AGCTCTAGGA 1440 GTCCTTACGCAGGTCTCAGG GATGAGCTTG CAACCCGTGG TATACCTGAG TAAGGAAATT 1500 GATGTAGTGGCAAAGGGTTG GCCTCATNGT TTATGGGTAA TGGNGGCAGT AGCAGTCTNA 1560 GTATCTGAAGCAGTTAAAAT AATACAGGGA AGAGATCTTN CTGTGTGGAC ATCTCATGAT 1620 GTGAACGGCATACTCACTGC TAAAGGAGAC TTGTGGTTGT CAGACAACCA TTTACTTAAN 1680 TATCAGGCTCTATTACTTGA AGAGCCAGTG CTGNGACTGC GCACTTGTGC AACTCTTAAA 1740 CCCAAACTTATGCTGCCCAG AAGGATCTTT NTAGAGGTCC CCTTAGCCAA CCCTGACCTC 1800 AACTATATATATACTGATGG AAGTTCGTTT GTAGAAAAGG GATTACAAAG GGNAGGATAT 1860 NCCATAGGTGTTAGTGATAA AGCAGTACTT GAAAGTAAGC CTCTTCCCCC CCAGGGACCA 1920 GCGCCCCCGTTAGCAGAACT AGTGGCACTG ACCCCGCGAG CCTTAGAACT TTGGAAAGGG 1980 AGGAGGATAAATGTGTATAC AGATAGCAAG TATGCTTATC TAATCCGAAA TGCCCATGTT 2040 GTTTATCTAATCCGAAATGC CCATGTTGCA ATATGGAAAG AAAGGGAGTT CCTAACCTCT 2100 GGGGGAACCCCCATTAAATA CCACAAGTTA ATCATGGAGT TATTGCACAC AGTGCAAAAA 2160 CTCAAGGAGGTGGAAGTCTT ACACTGCCAA AGCCATCAGA AAAGGGAAAG GGGAGAAGAG 2220 CAGCATAAGTGGCTACAGAG GCAAGGAAAG ACTAGCAGAA AGGAAAGAGA GAAAGAGACA 2280 GAAAGTCAGAGAGAGAGAGA GGAAGAGACA GAGCACAAAG AGGGAGTCAG AGAGAGAGAG 2340 AGACAGAGAGTCAGAGAGAA GGAAAGAGAG AGAGGAAGAG ACAAAGAATG A 2391 1722 base pairsnucleotide single linear cDNA 54 TGGAGAATAG CAGCATAAGT TGGCTGGCAGAAGTAGGGAA AGACAGCAAG AAGTAAAGAA 60 AAAAARGAGA AAGTCAGAGA AAGAAAAAAAGAGAGGAAGA AACAAAGAAG AACTTGAAGA 120 GAGAAAGAAG TAGTAAAGAA AAAACAGTATACCCTATTCC TTTAAAAGCC AGGGTAAATT 180 TCTGTCTACC TAGCCAAGGC ATATTCTTCTTATGTGGAAC ATCAACCTAT ATCTGCCTCC 240 CCACTAACTG GACAGGCACC TGAACCTTAGTCTTTCTAAG TCCCAACATT AACATTGCCC 300 CAGGAAATCA GACCCTATTG GTACCTGTCAAAGCTAAAGT CCCGTCAGTG CAGAGCCATA 360 CAACTAATAT CCCTATTTAT AGGGTTAGGAATGGCTACTG CTACAGGAAC TGGAATAGCC 420 GGTTTATCTA CTTCATTATC CTACTACCATACACTCTCAA AGAATTTCTC AGACAGTTTG 480 CAAGAAATAA TGAAATCTAT TCTTACTTTACAATCCCAAT TAGACTCTTT GGCAGCAATG 540 ACTCTCCAAA ACCGCCGAGG CCCACACCTCCTCACTGCTG 
AGAAAGGAGG ACTCTGCACC 600 TTCTTAGGGG AAGAGTGTTG TTTTTACACTAACCAGTCAG GGATAGTACG AGATGCCACC 660 TGGCATTTAC AGGAAAGGGC TTCTGATATCAGACAATGCC TTTCAAACTC TTATACCAAC 720 CTCTGGAGTT GGGCAACATG GCTTCTTCCATTTCTAGGTC CCATGGCAGC CATCTTGCTG 780 TTACTCACCT TTGGGCCCTG TATTTTTAAGCTTCTTGTCA AATTTGTTTC CTCTAGGATC 840 GAAGCCATCA AGCTACAGAT GGTCTTACAAATGGAACCCC AAATGAGTTC AACTAACAAC 900 TTCTACCAAG GACCCCTGGA ACGATCCACTGGCACTTCCA CTAGCCTAGA GATTCCCCTC 960 TGGAAGACAC TACAACTGCA GGGCCCCTTCTTTGCCCCTA TCCAGCAGGA AGTAGCTAGA 1020 GCGGTCATCG GCCAAATTCC CAACAGCAGTTGGGGTGTCC TGTTTAGAGG GGGGATTGAA 1080 GAGGTGACAG CCTGCTGGCA GCCTCACAGCCCTCGTTGGY TCTCAGTGCC TCCTCAGCCT 1140 TGGTGCCCAC TCTGGCCGTG CTTGAGGAGCCCTTCAGCCT GCCACTGCAC TGTGGGAGCC 1200 TCTTTCTGGG CTGGACAAGG CCGGAGCCAGCTCCCTCAGC TTGCAGGGAG GTATGGAGGG 1260 AGAGATGCAG GCGGGAACCA GGGCTGCGCATGGCGCTTGC GGGCCAGCAT GAGTTCCAGG 1320 TGGGCGTGGG CTCGGCGGGC CCCACACTCGGGCAGTGAGG GGCTTAGCAC CTGGGCCAGA 1380 CAGATGCTGT GCTCAACTTC TTCGCTGGGCCTTAGCTGCC TTCCCCGTGG GGCAGGGCTY 1440 CGGGAACMTG CAGCCTGCCC ATGCTTGAGCCCCCCACCCC GCCGTGGGTT CYTGCACAGC 1500 CCAAGCTTCC CGGACAAGCA CCACCCCTTATCCACGGTGC CCAGTCCCAT CAACCACCCA 1560 AGGGTTGAGG AGTGCGGGCA CACAGCGCGGGATTGGCAGG CAGTTCCACT TGCGGCCTTG 1620 GTGCGGGATC CACTGCGTGA AGCCAGCTGGGCTCCTGAGT CTGGTGGGGA CTTGGAGAAT 1680 CTTTATGTCT AGCTAAGGGA TTGTAAATACACCAATCAGC AC 1722 495 base pairs nucleotide single linear cDNA 55CTTCCCCAAC TAATAAGGAC CCCCCTTTCA ACCCAAACAG TCCAAAAGGA CATAGACAAA 60GGAGTAAACA ATGAACCAAA GAGTGCCAAT ATTCCCTGGT TATGCACCCT CCAAGCGGTG 120GGAGAAGAAT TCGGCCCAGC CAGAGTGCAT GTACCTTTTT CTCTCTCACA CTTGAAGCAA 180ATTAAAATAG ACNTAGGTNA ATTNTCAGAT AGCCCTGATG GYTATATTGA TGTTTTACAA 240GGATTAGGAC AATCCTTTGA TCTGACATGG AGAGATATAA TATTACTGCT AAATCAGACG 300CTAACCTCAA ATGAGAGAAG TGCTGCCATA ACTGGAGCCC GAGAGTTTGG CAATCTCTGG 360TATCTCAGTC AGGTCAATGA TAGGATGACA ACGGAGGAAA GAGAACGATT CCCCACAGGG 420CAGCAGGCAG TTCCCAGTGT AGCTCCTCAT TGGGACACAG AATCAGAACA TGGAGATTGG 480TGCCGCAGAC ATTTA 495 2503 base pairs nucleotide single linear cDNA 
56CCAAGAACCC ACCAATTCCG GANCACATTT TGGCGACCAC GAAGGGACTT TCGCATATCG 60CCAAGCGGTG AGACAATAGC CGAGCGGTGA GACCTTTCCC AATCGCCAAG CAGTGAGTAC 120CATCAGACCC CTTTCACTTG CTATTCTGTC CTATCTTTCT TTAGAATTCG GGGGCTAAAT 180ACCGGGCATC TGTCAGCCAT TTAAAAGTGA CTAGCGGGCC GCCGGACTAA AGACACGGGT 240GTCAAGCTTT CTGGGAAAGG GCTCTCTAAC AACCCCCAAC TCTTTGGAGT TGGGACCGTT 300GGTTTGCCTA GAACCAGCTT CCGCTTTTCC TGTACTTCTG GGCTGAGCCG TGGGTTGACA 360GTGAAGGAAA GCCATGCATC TCCGGGGTCT CGMCAACATG TTGGTTGACC CTGCGGCCAT 420GAGTGGAACT CTCAAAAGCA TGTCGCCCAA GCGACACTCG CCTATCTATC CTATCTATCC 480TGACCCTTGC CCTCTGGGTC CTAATGCCTG CCAGACAAAC TTCCTCTCGC CTCTCTTCTC 540TGAAGCTAGA ACCGCTTCTA AAAATTGCTA CCTGGTCTCT GGTGCTTTTC CTARTTTCTC 600CTATAAAGAA TGAWTTCTAG TATTAAACTC CAGGACTCTG TTACCTTCTT TAGGCACCCG 660GGCTCACCAA TCAGAAAGAC ACAGTTTTTG CCCAAGGCCC CATCGTAGTG GGGACTACCT 720GGAATTTTAG GATCCCTCCT CAGACTAACA GGCCTAACAA AAGTTATTCC TGAAGCTAGG 780ATATGGGGAG CCTCAGAAAT TGTATCCCTC CTATTCATAT AAGTGAGAAC AAAAGGTGTC 840ACTCTTCCAA CCCTGAAGAT CCCCTCCCTC CCTCAGGGTA TGGCCCTCCA TTTCATTTTT 900GTGGCATAAC ATCTTTATAG GATGGGGTAA AGTCCCAATA CTAACAGGAG AATGCTTAGG 960ACTCTAACAG GTTTTTGAGA ATGCGTCAGT AAGGGCCACT AAATCTGATT TTTCTCAGTC 1020GGTCCTCCTT GTGGTCTAGG AGGACAGGCA AGGTTGTGCA GGTTTTCGAG AATGCGTCAG 1080TAAGGACCAC TAAATCCGAC CTTCCTCGGT CCTCCATGTG GTCTGGGAGG AAAACTAGTG 1140TTTCTGCTGC TGCGTCGGTG AGCGCAACTA TTCAAGTCAG CAGGGTCCAG GGACCGTTGC 1200AGGTTCTTGG GCAGGGGTTG TTTCTGCTGC TGCATTGGTG AATGCAACTA TTCTGATCAG 1260CAGGGTCCCA GGACCATTGC AGGTCCTTGG GCAGGGAGAG AAACAAAACA AACCAAAACT 1320GTGGGCGGTT TTGTCTTTCA TATGGGAAAC ACTCAGGCAT CAACAGGTTC ACCCTTGAAA 1380TGCATCCTAA GCCATTGGGA CCAATTTGAC CCACAAACCC TGAAAAAGAG GAGGCTCATT 1440TTTTCCTGCA CTACGGCTTG GCCCCAATAT TCTCTTTYTG ATGGGGAAAA ATGGCCACCT 1500GAGGGAAGCA CAAATTACAA TAYTATCCTA CAGCYTGATC TTTTCTGTAA GAGGGAAGGC 1560AAATGGAGTG AATACCTTAT GTCCAAGCTT TCTTTTCATT GAGGGAGAAT ACACAACTAT 1620GCAAAGCTTG CAATTTACAT CCCACAGGAG GACCCTTCAG CTTACCCCCA TATCCTAGCC 1680TCCCTATAGC TTCCCTTCCT ATTGATGATA CTCCTCCTCT AATCTCCCCT 
GCCCAGAAGG 1740AAATAAGCAA AGAAATCTCC AAAGGTCCAC AAAAACCCCC GGGCTATCGG TTATGTCCCT 1800TCAAGYTGTA GGGGGAGGGG AATTTGGCCC AACCCGGGTG CATGTCCCTT CTCCCTCTCT 1860GATTTAAAGC AGATCAAGGC AGACCTGGGG AAGTTTTCAG ATGATCCTGA TAGGTACATA 1920GATGTCCTAC AGGGTCTAGG GCAAACCTTT GACCTCACTT GGAGAGACGT CATGCTACTG 1980TTAGATCAAA CCCTGGCCTT TAATGAAAAG AATGCGGCTT TAGCTGCAGC CTGAGAGTTT 2040GGAGATACCT GGTATCCTAG TCAAGTAAAT GAAAGAATGA CAGCCGAAGA AAGGGACAAC 2100TTCCTTACTG GTCAGCAACC CATCCCCAGT ATGGATCCCC ACTGGGACTT TGACTCAGAT 2160CATGGGGACT GGAGTCGTAA ACATCTGTTG ATCTGTGTTC TGGAAGGACT AAGGAGAATT 2220GGGAAAAAGC CCATGAATTA TTCAATGATA TCCACCATAA CCCAGGGAAA GGAAGAAAAT 2280CCTTCTGCCT TCCTCGAGCG GCTACAAGAG GCCTTAAGAA AATATACTCC CCTGTCACCC 2340GAATCACTCG AGGGTCAATT GATTCTAAAA GATAAGTTTA TTACCCAATC AGCCACAGAT 2400ATCAGGAGAA AGCTCCAAAA GCAAGCCCTG AGCCTGAACA AAATCTAGAG ACATTATTAA 2460ACCTGGCAAC CTTGGTGTTC TATAATAGGG ACCAAGAGGA ACA 2503 1167 base pairsnucleotide single linear cDNA 57 AAGGAAACTC AGAAAGCCAA TACCCATTTAGTAAGATGGA CACCAGAAGC AGAAGCAGCT 60 TTCCAGGCCC TAAAGAAATC CCTAACCCAAGCCCCAGTGT TAAGCTTGCC AACGGGGCAA 120 GACTTTTCTT TATATGTCAC AGAAAAACAGGAATAGCTCT AGGAGTCCTT ACACAGGTCC 180 AAGGGACAAG CTTGCAACCT GTGGCATACCTGAGTAAGGA AACTGATGTA NTGGCAAAGG 240 GTTGGCCTCA TTGTTTACAG GTAGGGCAGCAGTAGCAGTC TTAGTTTCTG AAACAGTTAA 300 AATAATACAG GGAAGAGATC TTACTGTGTGGACATCTCAT GATGTGAACG GCATACTCAC 360 TGCTAAAGAG GACTTGTGGC TGTCAGACAACCATTTACTT AAATAGCAGG TTCTATTACT 420 TGAAGTGCCA GTGCTGCGAC TGCACATTTGTGCAACTCTT AACCCAGCCA CATTTCTTCC 480 AGACAATGAA GAAAAGATAG AACATAACTGTCAACAAGTA ATTGCTCAAA CCTATGCTGC 540 TCGAGGGGAC CTTCTAGAGG TTCCCTTGACTGATCCCGAC CTCAACTTGT ATACTGATGG 600 AAGTTCCTTG GCAGAAAAAG GACTTTGAAAAGCGGGGTAT GCAGTGATCA GTGATAATGG 660 AATACTTGAA AGTAATCGCC TCACTCCAGGAACTAGTGCT CACCTGGCAG AACTAATAGC 720 CCTCACTTGG GCACTAGAAT TAGGAGAAGGAAAAAGGGTA AATATATATT CAGACTCTAA 780 GTATGCTTAC CTAGTCCTCC ATGCCCATGCAGCAATATGG AGAGAGAGGG AATTCCTAAC 840 TTCTGAGGGA ACACCTATCA ACCATCAGGGAAGCCATTAG GAGATTATTA TTGGCTGTAC 900 
AGAAACCTAA AGAGGTGGCA GTCTTACACTGCCAGGGTCA TCAGGAAGAA GAGGAAAGGG 960 AAATAGAAGG CAATCGCCAA GCGGATATTGAAGCAAAAAA AGCCGCAAGG CAGGACTCTC 1020 CATTAGAAAT GCTTATAGAA GGACCCCTAGTATGGGGTAA TCCCCTCTGG GAAACCAAGC 1080 CCCAGTACTC AGCAGGAAAA ATAGAATAGGAAACCTCACA AGGACATACT TTCCTCCCCT 1140 CCAGATGGCT AGCCACTGAG GAAGGAA 116778 base pairs nucleotide single linear cDNA 58 TCCAAAGGCA CCAGGGCCCTCAGTGAGGAA CGTATCCAGC CTATACTGGC TTATCCTCAT 60 CCCAAAACCC TAAAGCAA 78 26amino acids amino acid Not Relevant linear peptide 59 Ser Lys Gly ThrArg Ala Leu Ser Glu Glu Arg Ile Gln Pro Ile Leu 1 5 10 15 Ala Tyr ProHis Pro Lys Thr Leu Lys Gln 20 25 28 base pairs nucleotide single linearcDNA 60 AAATGTCTGC GGCACCAATC TCCATGTT 28 30 base pairs nucleotidesingle linear cDNA 61 AAGGGGCATG GACGAGGTGG TGGCTTATTT 30 21 base pairsnucleotide single linear cDNA 62 GGAGAAGAGC AGCATAAGTG G 21 25 basepairs nucleotide single linear cDNA 63 GTGCTGATTG GTGTATTTAC AATCC 25 34base pairs nucleotide single linear cDNA 64 GACTCGCTGC AGATCGATTTTTTTTTTTTT TTTT 34 30 base pairs nucleotide single linear cDNA 65GCCATCAAGC CACCCAAGAA CTCTTAACTT 30 30 base pairs nucleotide singlelinear cDNA 66 CCAATAGCCA GACCATTATA TACACTAATT 30 23 base pairsnucleotide single linear cDNA 67 GCCATAACTG CAACCCAAGA GTT 23 23 basepairs nucleotide single linear cDNA 68 GGACGAGGTG GTGGCTTATT TCT 23 25base pairs nucleotide single linear cDNA 69 AACTTGCGTG CTAGAAGGAC TAAGG25 24 base pairs nucleotide single linear cDNA 70 AACTTTTCCC TTTTCCAGATCCTC 24 22 base pairs nucleotide single linear cDNA 71 GCATACCAGGCAAGTGGACA TT 22 25 base pairs nucleotide single linear cDNA 72CTGTCCGTTG GGTTTCCTTA CTCCT 25 24 base pairs nucleotide single linearcDNA 73 GAGGCTCTGG AAAAGGGAAA AGTT 24 25 base pairs nucleotide singlelinear cDNA 74 AGGAGTAAGG AAACCCAACG GACAG 25 25 base pairs nucleotidesingle linear cDNA 75 TGTATATAAT GGTCTGGCTA TTGGG 25 26 base pairsnucleotide single linear cDNA 76 TTCGGCAGAA ACCTGTTATG CCAAGG 26 22 basepairs nucleotide single linear cDNA 77 
CTCGATTTCT TGCTGGGCCT TA 22 20base pairs nucleotide single linear cDNA 78 GTTGATTCCC TCCTCAAGCA 20 20base pairs nucleotide single linear cDNA 79 CTCTACCAAT CAGCATGTGG 20 19base pairs nucleotide single linear cDNA 80 TGTTCCTCTT GGTCCCTAT 19 433amino acids amino acid Not Relevant linear peptide 81 Met Ala Thr AlaThr Gly Thr Gly Ile Ala Gly Leu Ser Thr Ser Leu 1 5 10 15 Ser Tyr TyrHis Thr Leu Ser Lys Asn Phe Ser Asp Ser Leu Gln Glu 20 25 30 Ile Met LysSer Ile Leu Thr Leu Gln Ser Gln Leu Asp Ser Leu Ala 35 40 45 Ala Met ThrLeu Gln Asn Arg Arg Gly Pro His Leu Leu Thr Ala Glu 50 55 60 Lys Gly GlyLeu Cys Thr Phe Leu Gly Glu Glu Cys Cys Phe Tyr Thr 65 70 75 80 Asn GlnSer Gly Ile Val Arg Asp Ala Thr Trp His Leu Gln Glu Arg 85 90 95 Ala SerAsp Ile Arg Gln Cys Leu Ser Asn Ser Tyr Thr Asn Leu Trp 100 105 110 SerTrp Ala Thr Trp Leu Leu Pro Phe Leu Gly Pro Met Ala Ala Ile 115 120 125Leu Leu Leu Leu Thr Phe Gly Pro Cys Ile Phe Lys Leu Leu Val Lys 130 135140 Phe Val Ser Ser Arg Ile Glu Ala Ile Lys Leu Gln Met Val Leu Gln 145150 155 160 Met Glu Pro Gln Met Ser Ser Thr Asn Asn Phe Tyr Gln Gly ProLeu 165 170 175 Glu Arg Ser Thr Gly Thr Ser Thr Ser Leu Glu Ile Pro LeuTrp Lys 180 185 190 Thr Leu Gln Leu Gln Gly Pro Phe Phe Ala Pro Ile GlnGln Glu Val 195 200 205 Ala Arg Ala Val Ile Gly Gln Ile Pro Asn Ser SerTrp Gly Val Leu 210 215 220 Phe Arg Gly Gly Ile Glu Glu Val Thr Ala CysTrp Gln Pro His Ser 225 230 235 240 Pro Arg Trp Xaa Ser Val Pro Pro GlnPro Trp Cys Pro Leu Trp Pro 245 250 255 Cys Leu Arg Ser Pro Ser Ala CysHis Cys Thr Val Gly Ala Ser Phe 260 265 270 Trp Ala Gly Gln Gly Arg SerGln Leu Pro Gln Leu Ala Gly Arg Tyr 275 280 285 Gly Gly Arg Asp Ala GlyGly Asn Gln Gly Cys Ala Trp Arg Leu Arg 290 295 300 Ala Ser Met Ser SerArg Trp Ala Trp Ala Arg Arg Ala Pro His Ser 305 310 315 320 Gly Ser GluGly Leu Ser Thr Trp Ala Arg Gln Met Leu Cys Ser Thr 325 330 335 Ser SerLeu Gly Leu Ser Cys Leu Pro Arg Gly Ala Gly Leu Arg Glu 340 345 350 XaaAla Ala Cys Pro Cys Leu Ser Pro Pro Pro Arg Arg Gly 
Phe Leu 355 360 365His Ser Pro Ser Phe Pro Asp Lys His His Pro Leu Ser Thr Val Pro 370 375380 Ser Pro Ile Asn His Pro Arg Val Glu Glu Cys Gly His Thr Ala Arg 385390 395 400 Asp Trp Gln Ala Val Pro Leu Ala Ala Leu Val Arg Asp Pro LeuArg 405 410 415 Glu Ala Ser Trp Ala Pro Glu Ser Gly Gly Asp Leu Glu AsnLeu Tyr 420 425 430 Val 693 base pairs nucleotide single linear cDNA 82CTTCCCCAAC TAATAAGGAC CCCCCTTTCA ACCCAAACAG TCCAAAAGGA CATAGACAAA 60GGAGTAAACA ATGAACCAAA GAGTGCCAAT ATTCCCTGGT TATGCACCCT CCAAGCGGTG 120GGAGAAGAAT TCGGCCCAGC CAGAGTGCAT GTACCTTTTT CTCTCTCACA CTTGAAGCAA 180ATTAAAATAG ACNTAGGTNA ATTNTCAGAT AGCCCTGATG GYTATATTGA TGTTTTACAA 240GGATTAGGAC AATCCTTTGA TCTGACATGG AGAGATATAA TATTACTGCT AAATCAGACG 300CTAACCTCAA ATGAGAGAAG TGCTGCCATA ACTGGAGCCC GAGAGTTTGG CAATCTCTGG 360TATCTCAGTC AGGTCAATGA TAGGATGACA ACGGAGGAAA GAGAACGATT CCCCACAGGG 420CAGCAGGCAG TTCCCAGTGT AGCTCCTCAT TGGGACACAG AATCAGAACA TGGAGATTGG 480TGCCGCAGAC ATTTACTAAC TTGCGTGCTA GAAGGACTAA GGAAAACTAG GAAGACTATG 540AATTATTCAA TGATGTCCAC TATAACACAG GGGAAAGGAA GAAAATCCTA CTGCCTTTCT 600GGAGAGACTA AGGGAGGCAT TGAGGAAGCA TACCAGGCAA GTGGACATTG GAGGCTCTGG 660AAAAGGGAAA AGTTGGGCAA ATTGAATGCC TAA 693 1577 base pairs nucleotidesingle linear cDNA 83 AACTTGCGTG CTAGAAGGAC TAAGGAAAAC TAGGAAGACTATGAATTATT CAATGATGTC 60 CACTATAACA CAGGGGAAAG GAAGAAAATC CTACTGCCTTTCTGGAGAGA CTAAGGGAGG 120 CATTGAGGAA GCATACCAGG CAAGTGGACA TTGGAGGCTCTGGAAAAGGG AAAAGTTGGG 180 CAAATTGAAT GCCTAATAGG GCTTGCTTCC AGTGCAGTCTACAAGGACGC TTTAGAAAAG 240 ATTGTCCAAG TAGAAATAAG CCGCCCCTCG TCCATGCCCCTTATGTCAAG GGAATCACTG 300 GAAGGCCTAC TGCCCCAGGG GACGAAGGTC CTCTGAGTCAGAAGCCACTA ACCTGATGAT 360 CCAGCAGCAG GACTGAGGGT GCCCGGGGCA AGTGCCAGCCCATGCCATCA CCCTCAGAGC 420 CCCGGGTATG TTTGACCATT GAGAGCCAGG AAGTTAACTGTCTCCTGGAC ACTGGCGCAG 480 CCTTCTCAGT CTTACTTTCC TGTCCCAGAC AATTGTCCTCCAGATCTGTC ACTATCCGAG 540 GGGTCCTAAG ACAGCCAGTC ACTACATACT TCTCTCAGCCACTAAGTTGT GACTGGGGAA 600 CTTTACTCTT TTCACATGCT TTTCTAATTA TGCCTGAAAGCCCCACTCCC TTGTTAGGGA 660 GAGACATTTT 
AGCAAAAGCA GGGGCCATTA TACACCTGAACATAGGAAAA GGAATACCCA 720 TTTGCTGTCC CCTGCTTGAG GAAGGAATTA ATCCTGAAGTCTGGGCAATA GAAGGACAAT 780 ATGGACAAGC AAAGAATGCC CGTCCTGTTC AAGTTAAACTAAAGGATTCT GCCTCCTTTC 840 CCTACCAAAG GAAGTACCCT CTTAGACCCG AGGCCCTACAAGGACTCAAA AGATTGTTAA 900 GGACCTAAAA GCCCAAGGCC TAGTAAAACC ATGCAGTAGCCCCTGCAATA CTCCAATTTT 960 AGGAGTAAGG AAACCCAACG GACAGTGGAG GTTAGTGCAAGATCTCAGGA TTATTAATGA 1020 GGCTGTTTTT CCTCTATACC CAGCTGTATC TAGCCCTTATACTCTGCTTT CCCTAATACC 1080 AGAGGAAGCA GAGTAGTTTA CAGTCCTGGA CCTTAAGGATGCCTCTTTCT GCATCCCTGT 1140 ACATCCTGAT TCTCAATTCT TGTTTGTCTT TGAAGATCCTTTGAACCCAA TGTCTCAATT 1200 CACCTGGACT GTTTTACCCC AGGGGTTCCG GGATAGCCCCCATCTATTTG GCCAGGCATT 1260 AGCCCAAGAC TTGAGCCAAT TCTCATACCT GGACATCTTGTCCTTCGGTA TGGGATGATT 1320 TAATTTTAGC CACCCGTTCA GAAACCTTGT GCCATCAAGCCACCCAAGCG TTCTTAAATT 1380 TCCTCACTCC GTGTGGCTAC AAGGTTTCCA AACCAAAGGCTCAGCTCTGC TCACAGCAGG 1440 TTAAATACTT AGGGTTAAAA TTATCCAAAG GCACCAGGGCCCTCTGTGAG GAATGTATCC 1500 AACCTGTACT GGCTTATCTT CATCCCAAAA CCCTAAAGCAACTAAGAAGG TCCTTGGCAT 1560 AACAGGTTTC TGCCGAA 1577 182 amino acids aminoacid Not Relevant linear peptide 84 Ser Ser Ser Arg Thr Glu Gly Ala ArgGly Lys Cys Gln Pro Met Pro 1 5 10 15 Ser Pro Ser Glu Pro Arg Val CysLeu Thr Ile Glu Ser Gln Glu Val 20 25 30 Asn Cys Leu Leu Asp Thr Gly AlaAla Phe Ser Val Leu Leu Ser Cys 35 40 45 Pro Arg Gln Leu Ser Ser Arg SerVal Thr Ile Arg Gly Val Leu Arg 50 55 60 Gln Pro Val Thr Thr Tyr Phe SerGln Pro Leu Ser Cys Asp Trp Gly 65 70 75 80 Thr Leu Leu Phe Ser His AlaPhe Leu Ile Met Pro Glu Ser Pro Thr 85 90 95 Pro Leu Leu Gly Arg Asp IleLeu Ala Lys Ala Gly Ala Ile Ile His 100 105 110 Leu Asn Ile Gly Lys GlyIle Pro Ile Cys Cys Pro Leu Leu Glu Glu 115 120 125 Gly Ile Asn Pro GluVal Trp Ala Ile Glu Gly Gln Tyr Gly Gln Ala 130 135 140 Lys Asn Ala ArgPro Val Gln Val Lys Leu Lys Asp Ser Ala Ser Phe 145 150 155 160 Pro TyrGln Arg Lys Tyr Pro Leu Arg Pro Glu Ala Leu Gln Gly Leu 165 170 175 LysArg Leu Leu Arg Thr 180 36 base pairs nucleotide single linear 
cDNA 85AGATCTGCAG AATTCGATAT CACCCCCCCC CCCCCC 36 22 base pairs nucleotidesingle linear cDNA 86 AGATCTGCAG AATTCGATAT CA 22 2304 base pairsnucleotide single linear 87 TCCAGCAGCA GGACTGAGGG TGCCCGGGGC AAGTGCCAGCCCATGCCATC ACCCTCAGAG 60 CCCCGGGTAT GTTTGACCAT TGAGAGCCAG GAAGTTAACTGTCTCCTGGA CACTGGCGCA 120 GCCTTCTCAG TCTTACTTTC CTGTCCCAGA CAATTGTCCTCCAGATCTGT CACTATCCGA 180 GGGGTCCTAG GACAGCCAGT CACTACATAC TTCTCTCAGCCACTAAGTTG TGACTGGGGA 240 ACTTTACTCT TTTCACATGC TTTTCTAATT ATGCCTGAAAGCCCCACTCC CTTGTTAGGG 300 AGAGACATTT TAGCAAAAGC AGGGGCCATT ATACACCTGAACATAGGAAA AGGAATACCC 360 ATTTGCTGTC CCCTGCTTGA GGAAGGAATT AATCCTGAAGTCTGGGCAAT AGAAGGACAA 420 TATGGACAAG CAAAGAATGC CCGTCCTGTT CAAGTTAAACTAAAGGATTC TGCCTCCTTT 480 CCCTACCAAA GGAAGTACCC TCTTAGACCC GAGGCCCTACAAGGANCTCA AAAGATTGTT 540 AAGGACCTAA AAGCCCAAGG CCTAGTAAAA CCATGCAGTAGCCCCTGCAA TACTCCAATT 600 TTAGGAGTAA GGAAACCCAA CGGACAGTGG AGGTTAGTGCAAGATCTCAG GATTATTAAT 660 GAGGCTGTTT TTCCTCTATA CCCAGCTGTA TCTAGCCCTTATACTCTGCT TTCCCTAATA 720 CCAGAGGAAG CAGAGTGGTT TACAGTCCTG GACCTTAAGGATGCCTTTTT CTGCATCCCT 780 GTACGTCCTG ACTCTCAATT CTTGTTTGCC TTTGAAGATCCTTTGAACCC AACGTCTCAA 840 CTCACCTGGA CTGTTTTACC CCAAGGGTTC AGGGATAGCCCCCATCTATT TGGCCAGGCA 900 TTAGCCCAAG ACTTGAGTCA ATTCTCATAC CTGGACACTCTTGTCCTTCA GTACGTGGAT 960 GATTTACTTT TAGTCGCCCG TTCAGAAACC TTGTGCCATCAAGCCACCCA AGAACTCTTA 1020 ACTTTCCTCA CTACCTGTGG CTACAAGGTT TCCAAACCAAAGGCTCGGCT CTGCTCACAG 1080 GAGATTAGAT ACTTAGGGCT AAAATTATCC AAAGGCACCAGGGCCCTCAG TGAGGAACGT 1140 ATCCAGCCTA TACTGGCTTA TCCTCATCCC AAAACCCTAAAGCAACTAAG AGGGTTCCTT 1200 GGCATAACAG GTTTCTGCCG AAAACAGATT CCCAGGTACACCCCAATAGC CAGACCATTA 1260 TATACACTAA TTAGGGAAAC TCAGAAAGCC AATACCTATTTAGTAAGATG GACACCTACA 1320 GAAGTGGCTT TCCAGGCCCT AAAGAAGGCC CTAACCCAAGCCCCAGTGTT CAGCTTGCCA 1380 ACAGGGCAAG ATTTTTCTTT ATATGCCACA GAAAAAACAGGAATAGCTCT AGGAGTCCTT 1440 ACGCAGGTCT CAGGGATGAG CTTGCAACCC GTGGTATACCTGAGTAAGGA AATTGATGTA 1500 GTGGCAAAGG GTTGGCCTCA TTGTTTATGG GTAATGGCGGCAGTAGCAGT CTTAGTATCT 1560 GAAGCAGTTA AAATAATACA 
GGGAAGAGAT CTTACTGTGTGGACATCTCA TGATGTGAAC 1620 GGCATACTCA CTGCTAAAGG AGACTTGTGG TTGTCAGACAACCATTTACT TAATTATCAG 1680 GCTCTATTAC TTGAAGAGCC AGTGCTGAGA CTGCGCACTTGTGCAACTCT TAAACCCGCC 1740 ACATTTCTTC CAGACAATGA AGAAAAGATA GAACATAACTGTCAACAAGT AATTGCTCAA 1800 ACCTATGCTG CTCGAGGGGA CCTTCTAGAG GTTCCCTTGACTGATCCCGA CCTCAACTTG 1860 TATACTGATG GAAGTTCCTT GGCAGAAAAA GGACTTCGAAAAGCGGGGTA TGCAGTGATC 1920 AGTGATAATG GAATACTTGA AAGTAATCGC CTCACTCCAGGAACTAGTGC TCACCTGGCA 1980 GAACTAATAG CCCTCACTTG GGCACTAGAA TTAGGAGAAGGAAAAAGGGT AAATATATAT 2040 TCAGACTCTA AGTATGCTTA CCTAGTCCTC CATGCCCATGCAGCAATATG GAGAGAGAGG 2100 GAATTCCTAA CTTCTGAGGG AACACCTATC AACCATCAGGAAGCCATTAG GAGATTATTA 2160 TTGGCTGTAC AGAAACCTAA AGAGGTGGCA GTCTTACACTGCCAGGGTCA TCAGGAAGAA 2220 GAGGAAAGGG AAATAGAAGG CAATCGCCAA GCGGATATTGAAGCAAAAAA AGCCGCAAGG 2280 CAGGACTCTC CATTAGAAAT GCTT 2304 2365 basepairs nucleotide single linear 88 ATGATCCAGC AGCAGGACNG AGGGTGCCCGGGGCAAGCGC CAGCCCATGC CATCACCCTC 60 ACAGAGCCCC AGGTATGCTT GACCATTGAGGGTCAGAAGG GTNACTGTCT CCTGGACACT 120 GGCGGNGCCT TCTCAGTCTT ACTTTCCTGTCCTGGACAAC TGTCCTCCAG ATCTGTCACT 180 GTCCGAGGGG TCCTAGGACA GCCAGTCACTAGATACTTCT CCCAGCCACT AAGTTGTGAC 240 TGGGGAACTT TACTCTTCCC ACATGCTTTTCTAATTATGC CTGAAAGCCC CACTCTCTTG 300 TTGGGGAGAG ACATTCTAGC AAAAGCAGGGGCCATTATAC ATGTGAATAT AGGAGAAGGA 360 ACAACTGTTT GTTGTCCCCT GCTTGAGGAAGGAATTAATC CTGAAGTCCG GGCAACAGAA 420 GGACAATATG GACAAGCAAA GAATGCCCGTCCTGTTCAAG TTAAACTAAA GGATTCCACC 480 TCCTTTCCCT ACCAAAGGCA GTACCCCCTCAGACCCGAGA CCCAACAAGA ACTCCAAAAG 540 ATTGTAAAGG ACCTAAAAGC CCAAGGCCTAGTAAAACCAA GCAATAGCCC TTGCAAGACT 600 CCAATTTTAG GAGTAAGGAA ACCCAACGGACAGTGGAGGT TAGTGCAAGA ACTCAGGATT 660 ATCAATGAGG CTGTTGTTCC TCTATACCCAGCTGTACCTA ACCCTTATAC AGTGCTTTCC 720 CAAATACCAG AGGAAGCAGA GTGGTTTACAGTCCTGGACC TTAAGGATGC CTTTTTCTGC 780 ATCCCTGTAC GTCCTGACTC TCAATTCTTGTTTGCCTTTG AAGATCCTTT GAACCCAACG 840 TCTCAACTCA CCTGGACTGT TTTACCCCAAGGGTTCAGGG ATAGCCCCCA TCTATTTGGC 900 CAGGCATTAG CCCAAGACTT GAGTCAATTCTCATACCTGG ACACTCTTGT CCTTCAGTAC 960 
ATGGATGATT TACTTTTAGT CGCCCGTTCAGAAACCTTGT GCCATCAAGC CACCCAAGAA 1020 CTCTTAACTT TCCTCACTAC CTGTGGCTACAAGGTTTCCA AACCAAAGGC TCGGCTCTGC 1080 TCACAGGAGA TTAGATACTN AGGGCTAAAATTATCCAAAG GCACCAGGGC CCTCAGTGAG 1140 GAACGTATCC AGCCTATACT GGCTTATCCTCATCCCAAAA CCCTAAAGCA ACTAAGAGGG 1200 TTCCTTGGCA TAACAGGTTT CTGCCGAAAACAGATTCCCA GGTACASCCC AATAGCCAGA 1260 CCATTATATA CACTAATTAN GGAAACTCAGAAAGCCAATA CCTATTTAGT AAGATGGACA 1320 CCTACAGAAG TGGCTTTCCA GGCCCTAAAGAAGGCCCTAA CCCAAGCCCC AGTGTTCAGC 1380 TTGCCAACAG GGCAAGATTT TTCTTTATATGCCACAGAAA AAACAGGAAT AGCTCTAGGA 1440 GTCCTTACGC AGGTCTCAGG GATGAGCTTGCAACCCGTGG TATACCTGAG TAAGGAAATT 1500 GATGTAGTGG CAAAGGGTTG GCCTCATNGTTTATGGGTAA TGGNGGCAGT AGCAGTCTNA 1560 GTATCTGAAG CAGTTAAAAT AATACAGGGAAGAGATCTTN CTGTGTGGAC ATCTCATGAT 1620 GTGAACGGCA TACTSRCTGC TAAAGGAGACTTGTGGTTGT CAGACAACCA TTTACTTAAN 1680 TAYCAGGCYY TATTACTTGA AGAGCCAGTGCTGNGACTGC GCACTTGTCC AACTCTTAAA 1740 CCCAAACTTA TGCTGCCCAG AAGGATCTTTNTAGAGGTCC CCTTAGCCAA CCCTGACCTC 1800 AACTATATAT ATACTGATGG AAGTTCGTTTGTAGAAAAGG GATTACAAAG GGNAGGATAT 1860 NCCATAGGTG TTAGTGATAA AGCAGTACTTGAAAGTAAGC CTCTTCCCCC CCAGGGACCA 1920 GCGCCCCCGT TAGCAGAACT AGTGGCACTGACCCCGCGAG CCTTAGAACT TTGGAAAGGG 1980 AGGAGGATAA ATGTGTATAC AGATAGCAAGTATGCTTATC TAATCCGAAA TGCCCATGTT 2040 GCAATATGGA AAGAAAGGGA GTTCCTAACCTCTGGGGGAA CCCCCATTAA ATACCACAAG 2100 TTAATCATGG AGTTATTGCA CACAGTGCAAAAACTCAAGG AGGTGGAAGT CTTACACTGC 2160 CAAAGCCATC AGAAAAGGGA AAGAGGGGAAGAGCAGCATA AGTGGCTACA GAGGCAAGGA 2220 AAGACTAGCA GAAAGGAAAG AGAGAAAGAGACAGAAAGTC AGAGAGAGAG AGAGGAAGAG 2280 ACAGAGCACA AAGAGGGAGT CAGAGAGAGAGAGAGACAGA GAGTCAGAGA GAAGGAAAGA 2340 GAGAGAGGAA GAGACAAAGA ATGAH 2365768 amino acids amino acid Not Relevant linear peptide 89 Ser Ser SerArg Thr Glu Gly Ala Arg Gly Lys Cys Gln Pro Met Pro 1 5 10 15 Ser ProSer Glu Pro Arg Val Cys Leu Thr Ile Glu Ser Gln Glu Val 20 25 30 Asn CysLeu Leu Asp Thr Gly Ala Ala Phe Ser Val Leu Leu Ser Cys 35 40 45 Pro ArgGln Leu Ser Ser Arg Ser Val Thr Ile Arg Gly Val Leu Gly 50 55 60 Gln ProVal 
Thr Thr Tyr Phe Ser Gln Pro Leu Ser Cys Asp Trp Gly 65 70 75 80 ThrLeu Leu Phe Ser His Ala Phe Leu Ile Met Pro Glu Ser Pro Thr 85 90 95 ProLeu Leu Gly Arg Asp Ile Leu Ala Lys Ala Gly Ala Ile Ile His 100 105 110Leu Asn Ile Gly Lys Gly Ile Pro Ile Cys Cys Pro Leu Leu Glu Glu 115 120125 Gly Ile Asn Pro Glu Val Trp Ala Ile Glu Gly Gln Tyr Gly Gln Ala 130135 140 Lys Asn Ala Arg Pro Val Gln Val Lys Leu Lys Asp Ser Ala Ser Phe145 150 155 160 Pro Tyr Gln Arg Lys Tyr Pro Leu Arg Pro Glu Ala Leu GlnGly Xaa 165 170 175 Gln Lys Ile Val Lys Asp Leu Lys Ala Gln Gly Leu ValLys Pro Cys 180 185 190 Ser Ser Pro Cys Asn Thr Pro Ile Leu Gly Val ArgLys Pro Asn Gly 195 200 205 Gln Trp Arg Leu Val Gln Asp Leu Arg Ile IleAsn Glu Ala Val Phe 210 215 220 Pro Leu Tyr Pro Ala Val Ser Ser Pro TyrThr Leu Leu Ser Leu Ile 225 230 235 240 Pro Glu Glu Ala Glu Trp Phe ThrVal Leu Asp Leu Lys Asp Ala Phe 245 250 255 Phe Cys Ile Pro Val Arg ProAsp Ser Gln Phe Leu Phe Ala Phe Glu 260 265 270 Asp Pro Leu Asn Pro ThrSer Gln Leu Thr Trp Thr Val Leu Pro Gln 275 280 285 Gly Phe Arg Asp SerPro His Leu Phe Gly Gln Ala Leu Ala Gln Asp 290 295 300 Leu Ser Gln PheSer Tyr Leu Asp Thr Leu Val Leu Gln Tyr Val Asp 305 310 315 320 Asp LeuLeu Leu Val Ala Arg Ser Glu Thr Leu Cys His Gln Ala Thr 325 330 335 GlnGlu Leu Leu Thr Phe Leu Thr Thr Cys Gly Tyr Lys Val Ser Lys 340 345 350Pro Lys Ala Arg Leu Cys Ser Gln Glu Ile Arg Tyr Leu Gly Leu Lys 355 360365 Leu Ser Lys Gly Thr Arg Ala Leu Ser Glu Glu Arg Ile Gln Pro Ile 370375 380 Leu Ala Tyr Pro His Pro Lys Thr Leu Lys Gln Leu Arg Gly Phe Leu385 390 395 400 Gly Ile Thr Gly Phe Cys Arg Lys Gln Ile Pro Arg Tyr ThrPro Ile 405 410 415 Ala Arg Pro Leu Tyr Thr Leu Ile Arg Glu Thr Gln LysAla Asn Thr 420 425 430 Tyr Leu Val Arg Trp Thr Pro Thr Glu Val Ala PheGln Ala Leu Lys 435 440 445 Lys Ala Leu Thr Gln Ala Pro Val Phe Ser LeuPro Thr Gly Gln Asp 450 455 460 Phe Ser Leu Tyr Ala Thr Glu Lys Thr GlyIle Ala Leu Gly Val Leu 465 470 475 480 Thr Gln Val Ser Gly Met Ser LeuGln Pro Val 
Val Tyr Leu Ser Lys 485 490 495 Glu Ile Asp Val Val Ala LysGly Trp Pro His Cys Leu Trp Val Met 500 505 510 Ala Ala Val Ala Val LeuVal Ser Glu Ala Val Lys Ile Ile Gln Gly 515 520 525 Arg Asp Leu Thr ValTrp Thr Ser His Asp Val Asn Gly Ile Leu Thr 530 535 540 Ala Lys Gly AspLeu Trp Leu Ser Asp Asn His Leu Leu Asn Tyr Gln 545 550 555 560 Ala LeuLeu Leu Glu Glu Pro Val Leu Arg Leu Arg Thr Cys Ala Thr 565 570 575 LeuLys Pro Ala Thr Phe Leu Pro Asp Asn Glu Glu Lys Ile Glu His 580 585 590Asn Cys Gln Gln Val Ile Ala Gln Thr Tyr Ala Ala Arg Gly Asp Leu 595 600605 Leu Glu Val Pro Leu Thr Asp Pro Asp Leu Asn Leu Tyr Thr Asp Gly 610615 620 Ser Ser Leu Ala Glu Lys Gly Leu Arg Lys Ala Gly Tyr Ala Val Ile625 630 635 640 Ser Asp Asn Gly Ile Leu Glu Ser Asn Arg Leu Thr Pro GlyThr Ser 645 650 655 Ala His Leu Ala Glu Leu Ile Ala Leu Thr Trp Ala LeuGlu Leu Gly 660 665 670 Glu Gly Lys Arg Val Asn Ile Tyr Ser Asp Ser LysTyr Ala Tyr Leu 675 680 685 Val Leu His Ala His Ala Ala Ile Trp Arg GluArg Glu Phe Leu Thr 690 695 700 Ser Glu Gly Thr Pro Ile Asn His Gln GluAla Ile Arg Arg Leu Leu 705 710 715 720 Leu Ala Val Gln Lys Pro Lys GluVal Ala Val Leu His Cys Gln Gly 725 730 735 His Gln Glu Glu Glu Glu ArgGlu Ile Glu Gly Asn Arg Gln Ala Asp 740 745 750 Ile Glu Ala Lys Lys AlaAla Arg Gln Asp Ser Pro Leu Glu Met Leu 755 760 765 114 amino acidsamino acid Not Relevant linear peptide 90 Ser Ser Ser Arg Thr Glu GlyAla Arg Gly Lys Cys Gln Pro Met Pro 1 5 10 15 Ser Pro Ser Glu Pro ArgVal Cys Leu Thr Ile Glu Ser Gln Glu Val 20 25 30 Asn Cys Leu Leu Asp ThrGly Ala Ala Phe Ser Val Leu Leu Ser Cys 35 40 45 Pro Arg Gln Leu Ser SerArg Ser Val Thr Ile Arg Gly Val Leu Gly 50 55 60 Gln Pro Val Thr Thr TyrPhe Ser Gln Pro Leu Ser Cys Asp Trp Gly 65 70 75 80 Thr Leu Leu Phe SerHis Ala Phe Leu Ile Met Pro Glu Ser Pro Thr 85 90 95 Pro Leu Leu Gly ArgAsp Ile Leu Ala Lys Ala Gly Ala Ile Ile His 100 105 110 Leu Asn 654amino acids amino acid Not Relevant linear peptide 91 Ile Gly Lys GlyIle Pro Ile Cys Cys Pro Leu Leu 
Glu Glu Gly Ile 1 5 10 15 Asn Pro GluVal Trp Ala Ile Glu Gly Gln Tyr Gly Gln Ala Lys Asn 20 25 30 Ala Arg ProVal Gln Val Lys Leu Lys Asp Ser Ala Ser Phe Pro Tyr 35 40 45 Gln Arg LysTyr Pro Leu Arg Pro Glu Ala Leu Gln Gly Xaa Gln Lys 50 55 60 Ile Val LysAsp Leu Lys Ala Gln Gly Leu Val Lys Pro Cys Ser Ser 65 70 75 80 Pro CysAsn Thr Pro Ile Leu Gly Val Arg Lys Pro Asn Gly Gln Trp 85 90 95 Arg LeuVal Gln Asp Leu Arg Ile Ile Asn Glu Ala Val Phe Pro Leu 100 105 110 TyrPro Ala Val Ser Ser Pro Tyr Thr Leu Leu Ser Leu Ile Pro Glu 115 120 125Glu Ala Glu Trp Phe Thr Val Leu Asp Leu Lys Asp Ala Phe Phe Cys 130 135140 Ile Pro Val Arg Pro Asp Ser Gln Phe Leu Phe Ala Phe Glu Asp Pro 145150 155 160 Leu Asn Pro Thr Ser Gln Leu Thr Trp Thr Val Leu Pro Gln GlyPhe 165 170 175 Arg Asp Ser Pro His Leu Phe Gly Gln Ala Leu Ala Gln AspLeu Ser 180 185 190 Gln Phe Ser Tyr Leu Asp Thr Leu Val Leu Gln Tyr ValAsp Asp Leu 195 200 205 Leu Leu Val Ala Arg Ser Glu Thr Leu Cys His GlnAla Thr Gln Glu 210 215 220 Leu Leu Thr Phe Leu Thr Thr Cys Gly Tyr LysVal Ser Lys Pro Lys 225 230 235 240 Ala Arg Leu Cys Ser Gln Glu Ile ArgTyr Leu Gly Leu Lys Leu Ser 245 250 255 Lys Gly Thr Arg Ala Leu Ser GluGlu Arg Ile Gln Pro Ile Leu Ala 260 265 270 Tyr Pro His Pro Lys Thr LeuLys Gln Leu Arg Gly Phe Leu Gly Ile 275 280 285 Thr Gly Phe Cys Arg LysGln Ile Pro Arg Tyr Thr Pro Ile Ala Arg 290 295 300 Pro Leu Tyr Thr LeuIle Arg Glu Thr Gln Lys Ala Asn Thr Tyr Leu 305 310 315 320 Val Arg TrpThr Pro Thr Glu Val Ala Phe Gln Ala Leu Lys Lys Ala 325 330 335 Leu ThrGln Ala Pro Val Phe Ser Leu Pro Thr Gly Gln Asp Phe Ser 340 345 350 LeuTyr Ala Thr Glu Lys Thr Gly Ile Ala Leu Gly Val Leu Thr Gln 355 360 365Val Ser Gly Met Ser Leu Gln Pro Val Val Tyr Leu Ser Lys Glu Ile 370 375380 Asp Val Val Ala Lys Gly Trp Pro His Cys Leu Trp Val Met Ala Ala 385390 395 400 Val Ala Val Leu Val Ser Glu Ala Val Lys Ile Ile Gln Gly ArgAsp 405 410 415 Leu Thr Val Trp Thr Ser His Asp Val Asn Gly Ile Leu ThrAla Lys 420 425 430 Gly Asp Leu Trp Leu 
Ser Asp Asn His Leu Leu Asn TyrGln Ala Leu 435 440 445 Leu Leu Glu Glu Pro Val Leu Arg Leu Arg Thr CysAla Thr Leu Lys 450 455 460 Pro Ala Thr Phe Leu Pro Asp Asn Glu Glu LysIle Glu His Asn Cys 465 470 475 480 Gln Gln Val Ile Ala Gln Thr Tyr AlaAla Arg Gly Asp Leu Leu Glu 485 490 495 Val Pro Leu Thr Asp Pro Asp LeuAsn Leu Tyr Thr Asp Gly Ser Ser 500 505 510 Leu Ala Glu Lys Gly Leu ArgLys Ala Gly Tyr Ala Val Ile Ser Asp 515 520 525 Asn Gly Ile Leu Glu SerAsn Arg Leu Thr Pro Gly Thr Ser Ala His 530 535 540 Leu Ala Glu Leu IleAla Leu Thr Trp Ala Leu Glu Leu Gly Glu Gly 545 550 555 560 Lys Arg ValAsn Ile Tyr Ser Asp Ser Lys Tyr Ala Tyr Leu Val Leu 565 570 575 His AlaHis Ala Ala Ile Trp Arg Glu Arg Glu Phe Leu Thr Ser Glu 580 585 590 GlyThr Pro Ile Asn His Gln Glu Ala Ile Arg Arg Leu Leu Leu Ala 595 600 605Val Gln Lys Pro Lys Glu Val Ala Val Leu His Cys Gln Gly His Gln 610 615620 Glu Glu Glu Glu Arg Glu Ile Glu Gly Asn Arg Gln Ala Asp Ile Glu 625630 635 640 Ala Lys Lys Ala Ala Arg Gln Asp Ser Pro Leu Glu Met Leu 645650 149 amino acids amino acid Not Relevant linear peptide 92 Leu TyrThr Asp Gly Ser Ser Leu Ala Glu Lys Gly Leu Arg Lys Ala 1 5 10 15 GlyTyr Ala Val Ile Ser Asp Asn Gly Ile Leu Glu Ser Asn Arg Leu 20 25 30 ThrPro Gly Thr Ser Ala His Leu Ala Glu Leu Ile Ala Leu Thr Trp 35 40 45 AlaLeu Glu Leu Gly Glu Gly Lys Arg Val Asn Ile Tyr Ser Asp Ser 50 55 60 LysTyr Ala Tyr Leu Val Leu His Ala His Ala Ala Ile Trp Arg Glu 65 70 75 80Arg Glu Phe Leu Thr Ser Glu Gly Thr Pro Ile Asn His Gln Glu Ala 85 90 95Ile Arg Arg Leu Leu Leu Ala Val Gln Lys Pro Lys Glu Val Ala Val 100 105110 Leu His Cys Gln Gly His Gln Glu Glu Glu Glu Arg Glu Ile Glu Gly 115120 125 Asn Arg Gln Ala Asp Ile Glu Ala Lys Lys Ala Ala Arg Gln Asp Ser130 135 140 Pro Leu Glu Met Leu 145 21 base pairs nucleotide singlelinear 93 TCCAGCAGCA GGACTGAGGG T 21 24 base pairs nucleotide singlelinear 94 GACAGCAAAT GGGTATTCCT TTCC 24 25 base pairs nucleotide singlelinear 95 AGGAGTAAGG AAACCCAACG GACAG 25 25 base pairs 
nucleotide singlelinear 96 TGTATATAAT GGTCTGGCTA TTGGG 25 26 base pairs nucleotide singlelinear 97 GGCTCTGCTC ACAGGAGATT AGATAC 26 26 base pairs nucleotidesingle linear 98 AAAGGCACCA GGGCCCTCAG TGAGGA 26 28 base pairsnucleotide single linear 99 GGTTTAAGAG TTGCACAAGT GCGCAGTC 28 310 basepairs nucleic acid single linear cDNA 100 GCTTATAGAA GGACCCCTAGTATGGGGTAA TCCCCTCTGG GAAACCAAGC CCCAGTACTC 60 AGCAGGAAAA ATAGAATAGGAAACCTCACA AGGACATACT TTCCTCCCCT CCAGATGGCT 120 AGCCACTGAG GAAGGAAAAATACTTTCACC TGCAGCTAAC CAACAGAAAT TACTTAAAAC 180 CCTTCACCAA ACCTTCCACTTAGGCATTGA TAGCACCCAT CAGATGGCCA AATTATTATT 240 TACTGGACCA GGCCTTTTCAAAACTATCAA GAAGATAGTC AGGGGCTGTG AAGTGTGCCA 300 AAGAAATAAT 310 103 aminoacids amino acid single linear peptide 101 Leu Ile Glu Gly Pro Leu ValTrp Gly Asn Pro Leu Trp Glu Thr Lys 1 5 10 15 Pro Gln Tyr Ser Ala GlyLys Ile Glu Xaa Glu Thr Ser Gln Gly His 20 25 30 Thr Phe Leu Pro Ser ArgTrp Leu Ala Thr Glu Glu Gly Lys Ile Leu 35 40 45 Ser Pro Ala Ala Asn GlnGln Lys Leu Leu Lys Thr Leu His Gln Thr 50 55 60 Phe His Leu Gly Ile AspSer Thr His Gln Met Ala Lys Leu Leu Phe 65 70 75 80 Thr Gly Pro Gly LeuPhe Lys Thr Ile Lys Lys Ile Val Arg Gly Cys 85 90 95 Glu Val Cys Gln ArgAsn Asn 100 635 base pairs nucleic acid single linear cDNA 102CCCTGTATCT TTAACCTCCT TGTTAAGTTT GTCTCTTCCA GAATCAAAAC TGTAAAACTA 60CAAATTGTTC TTCAAATGGA GCACCAGATG GAGTCCATGA CTAAGATCCA CCGTGGACCC 120CTGGACCGGC CTGCTAGCCC ATGCTCCGAT GTTAATGACA TTGAAGGCAC CCCTCCCGAG 180GAAATCTCAA CTGCACAACC CCTACTATGC CCCAATTCAG CGGGAAGCAG TTAGAGCGGT 240CATCAGCCAA CCTCCCCAAC AGCACTTGGG TTTTCCTGTT GAGAGGGGGG ACTGAGAGAC 300AGGACTAGCT GGATTTCCTA GGCCAACGAA GAATCCCTAA GCCTAGCTGG GAAGGTGACT 360GCATCCACCT CTAAACATGG GGCTTGCAAC TTAGCTCACA CCCGACCAAT CAGAGAGCTC 420ACTAAAATGC TAATTAGGCA AAAATAGGAG GTAAAGAAAT AGCCAATCAT CTATTGCCTG 480AGAGCACAGC GGGAGGGACA AGGATCGGGA TATAAACCCA GGCATTCGAG CCGGCAACGG 540CAACCCCCTT TGGGTCCCCT CCCTTTGTAT GGGCGCTCTG TTTTCACTCT ATTTCACTCT 600ATTAAATCTT GCAACTGAAA AAAAAAAAAA AAAAA 635 77 amino 
acids amino acidsingle linear peptide 103 Pro Cys Ile Phe Asn Leu Leu Val Lys Phe ValSer Ser Arg Ile Lys 1 5 10 15 Thr Val Lys Leu Gln Ile Val Leu Gln MetGlu His Gln Met Glu Ser 20 25 30 Met Thr Lys Ile His Arg Gly Pro Leu AspArg Pro Ala Ser Pro Cys 35 40 45 Ser Asp Val Asn Asp Ile Glu Gly Thr ProPro Glu Glu Ile Ser Thr 50 55 60 Ala Gln Pro Leu Leu Cys Pro Asn Ser AlaGly Ser Ser 65 70 75 32 base pairs nucleic acid single linear cDNA 104TGGGGTTCCA TTTGTAAGAC CATCTGTAGC TT 32 1481 base pairs nucleic acidsingle linear cDNA 105 ATGGCCCTCC CTTATCATAC TTTTCTCTTT ACTGTTCTCTTACCCCCTTT CGCTCTCACT 60 GCACCCCCTC CATGCTGCTG TACAACCAGT AGCTCCCCTTACCAAGAGTT TCTATGAAGA 120 ACGCGGCTTC CTGGAAATAT TGATGCCCCA TCATATAGGAGTTTATCTAA GGGAAACTCC 180 ACCTTCACTG CCCACACCCA TATGCCCCGC AACTGCTATAACTCTGCCAC TCTTTGCATG 240 CATGCAAATA CTCATTATTG GACAGGGAAA ATGATTAATCCTAGTTGTCC TGGAGGACTT 300 GGAGCCACTG TCTGTTGGAC TTACTTCACC CATACCAGTATGTCTGATGG GGGTGGAATT 360 CAAGGTCAGG CAAGAGAAAA ACAAGTAAAG GAAGCAATCTCCCAACTGAC CCGGGGACAT 420 AGCACCCCTA GCCCCTACAA AGGACTAGTT CTCTCAAAACTACATGAAAC CCTCCGTACC 480 CATACTCGCC TGGTGAGCCT ATTTAATACC ACCCTCACTCGGCTCCATGA GGTCTCAGCC 540 CAAAACCCTA CTAACTGTTG GATGTGCCTC CCCCTGCACTTCAGGCCATA CATTTCAATC 600 CCTGTTCCTG AACAATGGAA CAACTTCAGC ACAGAAATAAACACCACTTC CGTTTTAGTA 660 GGACCTCTTG TTTCCAATCT GGAAATAACC CATACCTCAAACCTCACCTG TGTAAAATTT 720 AGCAATACTA TAGACACAAC CAGCTCCCAA TGCATCAGGTGGGTAACACC TCCCACACGA 780 ATAGTCTGCC TACCCTCAGG AATATTTTTT GTCTGTGGTACCTCAGCCTA TCATTGTTTG 840 AATGGCTCTT CAGAATCTAT GTGCTTCCTC TCATTCTTAGTGCCCCCTAT GACCATCTAC 900 ACTGAACAAG ATTTATACAA TCATGTCGTA CCTAAGCCCCACAACAAAAG AGTACCCATT 960 CTTCCTTTTG TTATCAGAGC AGGAGTGCTA GGCAGACTAGGTACTGGCAT TGGCAGTATC 1020 ACAACCTCTA CTCAGTTCTA CTACAAACTA TCTCAAGAAATAAATGGTGA CATGGAACAG 1080 GTCACTGACT CCCTGGTCAC CTTGCAAGAT CAACTTAACTCCCTAGCAGC AGTAGTCCTT 1140 CAAAATCGAA GAGCTTTAGA CTTGCTAACC GCCAAAAGAGGGGGAACCTG TTTATTTTTA 1200 GGAGAAGAAC GCTGTTATTA TGTTAATCAA TCCAGAATTGTCACTGAGAA AGTTAAAGAA 1260 ATTCGAGATC 
GAATACAATG TAGAGCAGAG GAGCTTCAAAACACCGAACG CTGGGGCCTC 1320 CTCAGCCAAT GGATGCCCTG GGTTCTCCCC TTCTTAGGACCTCTAGCAGC TCTAATATTG 1380 TTACTCCTCT TTGGACCCTG TATCTTTAAC CTCCTTGTTAAGTTTGTCTC TTCCAGAATT 1440 GAAGCTGTAA AGCTACAGAT GGTCTTACAA ATGGAACCCC A1481 493 amino acids amino acid single linear peptide 106 Met Ala LeuPro Tyr His Thr Phe Leu Phe Thr Val Leu Leu Pro Pro 1 5 10 15 Phe AlaLeu Thr Ala Pro Pro Pro Cys Cys Cys Thr Thr Ser Ser Ser 20 25 30 Pro TyrGln Glu Phe Leu Xaa Arg Thr Arg Leu Pro Gly Asn Ile Asp 35 40 45 Ala ProSer Tyr Arg Ser Leu Ser Lys Gly Asn Ser Thr Phe Thr Ala 50 55 60 His ThrHis Met Pro Arg Asn Cys Tyr Asn Ser Ala Thr Leu Cys Met 65 70 75 80 HisAla Asn Thr His Tyr Trp Thr Gly Lys Met Ile Asn Pro Ser Cys 85 90 95 ProGly Gly Leu Gly Ala Thr Val Cys Trp Thr Tyr Phe Thr His Thr 100 105 110Ser Met Ser Asp Gly Gly Gly Ile Gln Gly Gln Ala Arg Glu Lys Gln 115 120125 Val Lys Glu Ala Ile Ser Gln Leu Thr Arg Gly His Ser Thr Pro Ser 130135 140 Pro Tyr Lys Gly Leu Val Leu Ser Lys Leu His Glu Thr Leu Arg Thr145 150 155 160 His Thr Arg Leu Val Ser Leu Phe Asn Thr Thr Leu Thr ArgLeu His 165 170 175 Glu Val Ser Ala Gln Asn Pro Thr Asn Cys Trp Met CysLeu Pro Leu 180 185 190 His Phe Arg Pro Tyr Ile Ser Ile Pro Val Pro GluGln Trp Asn Asn 195 200 205 Phe Ser Thr Glu Ile Asn Thr Thr Ser Val LeuVal Gly Pro Leu Val 210 215 220 Ser Asn Leu Glu Ile Thr His Thr Ser AsnLeu Thr Cys Val Lys Phe 225 230 235 240 Ser Asn Thr Ile Asp Thr Thr SerSer Gln Cys Ile Arg Trp Val Thr 245 250 255 Pro Pro Thr Arg Ile Val CysLeu Pro Ser Gly Ile Phe Phe Val Cys 260 265 270 Gly Thr Ser Ala Tyr HisCys Leu Asn Gly Ser Ser Glu Ser Met Cys 275 280 285 Phe Leu Ser Phe LeuVal Pro Pro Met Thr Ile Tyr Thr Glu Gln Asp 290 295 300 Leu Tyr Asn HisVal Val Pro Lys Pro His Asn Lys Arg Val Pro Ile 305 310 315 320 Leu ProPhe Val Ile Arg Ala Gly Val Leu Gly Arg Leu Gly Thr Gly 325 330 335 IleGly Ser Ile Thr Thr Ser Thr Gln Phe Tyr Tyr Lys Leu Ser Gln 340 345 350Glu Ile Asn Gly Asp Met Glu Gln Val Thr Asp Ser Leu 
Val Thr Leu 355 360365 Gln Asp Gln Leu Asn Ser Leu Ala Ala Val Val Leu Gln Asn Arg Arg 370375 380 Ala Leu Asp Leu Leu Thr Ala Lys Arg Gly Gly Thr Cys Leu Phe Leu385 390 395 400 Gly Glu Glu Arg Cys Tyr Tyr Val Asn Gln Ser Arg Ile ValThr Glu 405 410 415 Lys Val Lys Glu Ile Arg Asp Arg Ile Gln Cys Arg AlaGlu Glu Leu 420 425 430 Gln Asn Thr Glu Arg Trp Gly Leu Leu Ser Gln TrpMet Pro Trp Val 435 440 445 Leu Pro Phe Leu Gly Pro Leu Ala Ala Leu IleLeu Leu Leu Leu Phe 450 455 460 Gly Pro Cys Ile Phe Asn Leu Leu Val LysPhe Val Ser Ser Arg Ile 465 470 475 480 Glu Ala Val Lys Leu Gln Met ValLeu Gln Met Glu Pro 485 490 32 base pairs nucleic acid single linearcDNA 107 TCAAAATCGA AGAGCTTTAG ACTTGCTAAC CG 32 1329 base pairs nucleicacid single linear cDNA 108 TCAAAATCGA AGAGCTTTAG ACTTGCTAAC CGCCAAAAGAGGGGGAACCT GTTTATTTTT 60 AGGGGAAGAA TGCTGTTAGT ATGTTAATCA ATCTGGAATCATTACTGAGA AAGTTAAAGA 120 AATTTGAGAT CGAATATAAT GTAGAGCAGA GGACCTTCAAAACACTGCAC CCTGGGGCCT 180 CCTCAGCCAA TGGATGCCCT GGACTCTCCC CTTCTTAGGACCTCTAGCAG CTATAATATT 240 TTTACTCCTC TTTGGACCCT GTATCTTCAA CTTCCTTGTTAAGTTTGTCT CTTCCAGAAT 300 TGAAGCTGTA AAGCTACAAA TAGTTCTTCA AATGGAACCCCAGATGCAGT CCATGACTAA 360 AATCTACCGT GGACCCCTGG ACCGGCCTGC TAGACTATGCTCTGATGTTA ATGACATTGA 420 AGTCACCCCT CCCGAGGAAA TCTCAACTGC ACAACCCCTACTACACTCCA ATTCAGTAGG 480 AAGCAGTTAG AGCAGTTGTC AGCCAACCTC CCCAACAGTACTTGGGTTTT CCTGTTGAGA 540 GGGTGGACTG AGAGACAGGA CTAGCTGGAT TTCCTAGGCTGACTAAGAAT CCCNAAGCCT 600 ANCTGGGAAG GTGACCGCAT CCATCTTTAA ACATGGGGCTTGCAACTTAG CTCACACCCG 660 ACCAATCAGA GAGCTCACTA AAATGCTAAT CAGGCAAAAACAGGAGGTAA AGCAATAGCC 720 AATCATCTAT TGCCTGAGAG CACAGCGGGA AGGACAAGGATTGGGATATA AACTCAGGCA 780 TTCAAGCCAG CAACAGCAAC CCCCTTTGGG TCCCCTCCCATTGTATGGGA GCTCTGTTTT 840 CACTCTATTT CACTCTATTA AATCATGCAA CTGCACTCTTCTGGTCCGTG TTTTTTATGG 900 CTCAAGCTGA GCTTTTGTTC GCCATCCACC ACTGCTGTTTGCCACCGTCA CAGACCCGCT 960 GCTGACTTCC ATCCCTTTGG ATCCAGCAGA GTGTCCACTGTGCTCCTGAT CCAGCGAGGT 1020 ACCCATTGCC ACTCCCGATC AGGCTAAAGG CTTGCCATTGTTCCTGCATG GCTAAGTGCC 1080 
TGGGTTTGTC CTAATAGAAC TGAACACTGG TCACTGGGTTCCATGGTTCT CTTCCATGAC 1140 CCACGGCTTC TAATAGAGCT ATAACACTCA CCGCATGGCCCAAGATTCCA TTCCTTGGTA 1200 TCTGTGAGGC CAAGAACCCC AGGTCAGAGA ANGTGAGGCTTGCCACCATT TGGGAAGTGG 1260 CCCACTGCCA TTTTGGTAGC GGCCCACCAC CATCTTGGGAGCTGTGGGAG CAAGGATCCC 1320 CCAGTAACA 1329 162 amino acids amino acidsingle linear peptide 109 Gln Asn Arg Arg Ala Leu Asp Leu Leu Thr AlaLys Arg Gly Gly Thr 1 5 10 15 Cys Leu Phe Leu Gly Glu Glu Cys Cys XaaTyr Val Asn Gln Ser Gly 20 25 30 Ile Ile Thr Glu Lys Val Lys Glu Ile XaaAsp Arg Ile Xaa Cys Arg 35 40 45 Ala Glu Asp Leu Gln Asn Thr Ala Pro TrpGly Leu Leu Ser Gln Trp 50 55 60 Met Pro Trp Thr Leu Pro Phe Leu Gly ProLeu Ala Ala Ile Ile Phe 65 70 75 80 Leu Leu Leu Phe Gly Pro Cys Ile PheAsn Phe Leu Val Lys Phe Val 85 90 95 Ser Ser Arg Ile Glu Ala Val Lys LeuGln Ile Val Leu Gln Met Glu 100 105 110 Pro Gln Met Gln Ser Met Thr LysIle Tyr Arg Gly Pro Leu Asp Arg 115 120 125 Pro Ala Arg Leu Cys Ser AspVal Asn Asp Ile Glu Val Thr Pro Pro 130 135 140 Glu Glu Ile Ser Thr AlaGln Pro Leu Leu His Ser Asn Ser Val Gly 145 150 155 160 Ser Ser 21 basepairs nucleic acid single linear cDNA 110 GGCATTGATA GCACCCATCA G 21 21base pairs nucleic acid single linear cDNA 111 CATGTCACCA GGGTGGAATA G21 758 base pairs nucleic acid single linear cDNA 112 GGCATTGATAGCACCCATCA GATGGCCAAA TCATTATTTA CTGGACCAGG CCTTTTCAAA 60 ACTATCAAGCAGATAGGGCC CGTGAAGCAT GCCAAAGAAA TAATCCCCTG CCTTATCGCC 120 ATGTTCCTTCAGGAGAACAA AGAACAGGCC ATTACCCAGG GGAAGACTGG CAACTAGATT 180 TTACCCACATGGCCAAATGT CAGGGATTTC AGCATCTACT AGTCTGGGCA GATACTTTCA 240 CTGGTTGGGTGGAGTCTTCT CCTTGTAGGA CAGAAAAGAC CCAAGAGGTA ATAAAGGCAC 300 TAATGAAATAATTCCCAGAT TTGGACTTCC CCCAGGATTA CAGGGTGACA ATGGCCCCGC 360 TTTCAAGGCTGCAGTAACCC AGGGAGTATC CCAGGTGTTA GGCATACAAT ATCACTTACA 420 CTGTGCCTGGAGGCCACAAT CCTCCAGAAA AGTCAAGAAA ATGAATGAAA CACTCAAAGA 480 TCTAAAAAAGCTAACCCAAG AAACCCACAT TGCATGACCT GTTCTGTTGC CTATAACCTT 540 ACTAAGAATCCATAACTATC CCCCAAAAAG CAGGACTTAG CCCATACGAG ATGCTATATG 600 
GATGGCCTTTCCTAACCAAT GACCTTGTGC TTGACTGAGA AATGGCCAAC TTAGTTGCAG 660 ACATCACCTCCTTAGCCAAA TATCAACAAG TTCTTAAAAC ATCACAGGGA ACCTGTCCCC 720 GAGAGGAGGGAAAGGAACTA TTCCACCCTG GTGACATG 758 25 base pairs nucleic acid singlelinear cDNA 113 CGGACATCCA AAGTGATGGG AAACG 25 26 base pairs nucleicacid single linear cDNA 114 GGACAGGAAA GTAAGACTGA GAAGGC 26 26 basepairs nucleic acid single linear cDNA 115 CCTAGAACGT ATTCTGGAGA ATTGGG26 26 base pairs nucleic acid single linear cDNA 116 TGGCTCTCAATGGTCAAACA TACCCG 26 1511 base pairs nucleic acid single linear cDNA 117CCTAGAACGT ATTCTGGAGA ATTGGGACCA ATGTGACACT CAGACGCTAA GAAAGAAACG 60ATTTATATTC TTCTGCAGTA CCGCCTGGCC ACAATATCCT CTTCAAGGGA GAGAAACCTG 120GCTTCCTGAG GGAAGTATAA ATTATAACAT CATCTTACAG CTAGACCTCT TCTGTAGAAA 180GGAGGGCAAA TGGAGTGAAG TGCCATATGT GCAAACTTTC TTTTCATTAA GAGACAACTC 240ACAATTATGT AAAAAGTGTG GTTTATGCCC TACAGGAAGC CCTCAGAGTC CACCTCCCTA 300CCCCAGCGTC CCCTCCCCGA CTCCTTCCTC AACTAATAAG GACCCCCCTT TAACCCAAAC 360GGTCCAAAAG GAGATAGACA AAGGGGTAAA CAATGAACCA AAGAGTGCCA ATATTCCCCG 420ATTATGCCCC CTCCAAGCAG TGAGAGGAGG AGAATTCGGC CCAGCCAGAG TGCCTGTACC 480TTTTTCTCTC TCAGACTTAA AGCAAATTAA AATAGACCTA GGTAAATTCT CAGATAACCC 540TGACGGCTAT ATTGATGTTT TACAAGGGTT AGGACAATCC TTTGATCTGA CATGGAGAGA 600TATAATGTTA CTACTAAATC AGACACTAAC CCCAAATGAG AGAAGTGCCG CTGTAACTGC 660AGCCCGAGAG TTTGGCGATC TTTGGTATCT CAGTCAGGCC AACAATAGGA TGACAACAGA 720GGAAAGAACA ACTCCCACAG GCCAGCAGGC AGTTCCCAGT GTAGACCCTC ATTGGGACAC 780AGAATCAGAA CATGGAGATT GGTGCCACAA ACATTTGCTA ACTTGCGTGC TAGAAGGACT 840GAGGAAAACT AGGAAGAAGC CTATGAATTA CTCAATGATG TCCACTATAA CACAGGGAAA 900GGAAGAAAAT CTTACTGCTT TTCTGGACAG ACTAAGGGAG GCATTGAGGA AGCATACCTC 960CCTGTCACCT GACTCTATTG AAGGCCAACT AATCTTAAAG GATAAGTTTA TCACTCAGTC 1020AGCTGCAGAC ATTAGAAAAA ACTTCAAAAG TCTGCCTTAG GCCCGGAGCA GAACTTAGAA 1080ACCCTATTTA ACTTGGCATC CTCAGTTTTT TATAATAGAG ATCAGGAGGA GCAGGCGAAA 1140CGGGACAAAC GGGATAAAAA AAAAAGGGGG GGTCCACTAC TTTAGTCATG GCCCTCAGGC 1200AAGCAGACTT TGGAGGCTCT GCAAAAGGGA AAAGCTGGGC AAATCAAATG 
CCTAATAGGG 1260CTGGCTTCCA GTGCGGTCTA CAAGGACACT TTAAAAAAGA TTATCCAAGT AGAAATAAGC 1320CGCCCCCTTG TCCATGCCCC TTACGTCAAG GGAATCACTG GAAGGCCCAC TGCCCCAGGG 1380GATGAAGATA CTCTGAGTCA GAAGCCATTA ACCAGATGAT CCAGCAGCAG GACTGAGGGT 1440GCCCGGGGCG AGCGCCAGCC CATGCCATCA CCCTCACAGA GCCCCGGGTA TGTTTGACCA 1500TTGAGAGCCA A 1511 352 amino acids amino acid single linear peptide 118Leu Glu Arg Ile Leu Glu Asn Trp Asp Gln Cys Asp Thr Gln Thr Leu 1 5 1015 Arg Lys Lys Arg Phe Ile Phe Phe Cys Ser Thr Ala Trp Pro Gln Tyr 20 2530 Pro Leu Gln Gly Arg Glu Thr Trp Leu Pro Glu Gly Ser Ile Asn Tyr 35 4045 Asn Ile Ile Leu Gln Leu Asp Leu Phe Cys Arg Lys Glu Gly Lys Trp 50 5560 Ser Glu Val Pro Tyr Val Gln Thr Phe Phe Ser Leu Arg Asp Asn Ser 65 7075 80 Gln Leu Cys Lys Lys Cys Gly Leu Cys Pro Thr Gly Ser Pro Gln Ser 8590 95 Pro Pro Pro Tyr Pro Ser Val Pro Ser Pro Thr Pro Ser Ser Thr Asn100 105 110 Lys Asp Pro Pro Leu Thr Gln Thr Val Gln Lys Glu Ile Asp LysGly 115 120 125 Val Asn Asn Glu Pro Lys Ser Ala Asn Ile Pro Arg Leu CysPro Leu 130 135 140 Gln Ala Val Arg Gly Gly Glu Phe Gly Pro Ala Arg ValPro Val Pro 145 150 155 160 Phe Ser Leu Ser Asp Leu Lys Gln Ile Lys IleAsp Leu Gly Lys Phe 165 170 175 Ser Asp Asn Pro Asp Gly Tyr Ile Asp ValLeu Gln Gly Leu Gly Gln 180 185 190 Ser Phe Asp Leu Thr Trp Arg Asp IleMet Leu Leu Leu Asn Gln Thr 195 200 205 Leu Thr Pro Asn Glu Arg Ser AlaAla Val Thr Ala Ala Arg Glu Phe 210 215 220 Gly Asp Leu Trp Tyr Leu SerGln Ala Asn Asn Arg Met Thr Thr Glu 225 230 235 240 Glu Arg Thr Thr ProThr Gly Gln Gln Ala Val Pro Ser Val Asp Pro 245 250 255 His Trp Asp ThrGlu Ser Glu His Gly Asp Trp Cys His Lys His Leu 260 265 270 Leu Thr CysVal Leu Glu Gly Leu Arg Lys Thr Arg Lys Lys Pro Met 275 280 285 Asn TyrSer Met Met Ser Thr Ile Thr Gln Gly Lys Glu Glu Asn Leu 290 295 300 ThrAla Phe Leu Asp Arg Leu Arg Glu Ala Leu Arg Lys His Thr Ser 305 310 315320 Leu Ser Pro Asp Ser Ile Glu Gly Gln Leu Ile Leu Lys Asp Lys Phe 325330 335 Ile Thr Gln Ser Ala Ala Asp Ile Arg Lys Asn Phe Lys Ser Leu 
Pro340 345 350 30 base pairs nucleic acid single linear cDNA 119 TGCTGGAATTCGGGATCCTA GAACGTATTC 30 30 base pairs nucleic acid single linear cDNA120 AGTTCTGCTC CGAAGCTTAG GCAGACTTTT 30 398 amino acids amino acidsingle linear peptide 121 Met Gly Ser Ser His His His His His His SerSer Gly Leu Val Pro 1 5 10 15 Arg Gly Ser His Met Ala Ser Met Thr GlyGly Gln Gln Met Gly Arg 20 25 30 Ile Leu Glu Arg Ile Leu Glu Asn Trp AspGln Cys Asp Thr Gln Thr 35 40 45 Leu Arg Lys Lys Arg Phe Ile Phe Phe CysSer Thr Ala Trp Pro Gln 50 55 60 Tyr Pro Leu Gln Gly Arg Glu Thr Trp LeuPro Glu Gly Ser Ile Asn 65 70 75 80 Tyr Asn Ile Ile Leu Gln Leu Asp LeuPhe Cys Arg Lys Glu Gly Lys 85 90 95 Trp Ser Glu Val Pro Tyr Val Gln ThrPhe Phe Ser Leu Arg Asp Asn 100 105 110 Ser Gln Leu Cys Lys Lys Cys GlyLeu Cys Pro Thr Gly Ser Pro Gln 115 120 125 Ser Pro Pro Pro Tyr Pro SerVal Pro Ser Pro Thr Pro Ser Ser Thr 130 135 140 Asn Lys Asp Pro Pro LeuThr Gln Thr Val Gln Lys Glu Ile Asp Lys 145 150 155 160 Gly Val Asn AsnGlu Pro Lys Ser Ala Asn Ile Pro Arg Leu Cys Pro 165 170 175 Leu Gln AlaVal Arg Gly Gly Glu Phe Gly Pro Ala Arg Val Pro Val 180 185 190 Pro PheSer Leu Ser Asp Leu Lys Gln Ile Lys Ile Asp Leu Gly Lys 195 200 205 PheSer Asp Asn Pro Asp Gly Tyr Ile Asp Val Leu Gln Gly Leu Gly 210 215 220Gln Ser Phe Asp Leu Thr Trp Arg Asp Ile Met Leu Leu Leu Asn Gln 225 230235 240 Thr Leu Thr Pro Asn Glu Arg Ser Ala Ala Val Thr Ala Ala Arg Glu245 250 255 Phe Gly Asp Leu Trp Tyr Leu Ser Gln Ala Asn Asn Arg Met ThrThr 260 265 270 Glu Glu Arg Thr Thr Pro Thr Gly Gln Gln Ala Val Pro SerVal Asp 275 280 285 Pro His Trp Asp Thr Glu Ser Glu His Gly Asp Trp CysHis Lys His 290 295 300 Leu Leu Thr Cys Val Leu Glu Gly Leu Arg Lys ThrArg Lys Lys Pro 305 310 315 320 Met Asn Tyr Ser Met Met Ser Thr Ile ThrGln Gly Lys Glu Glu Asn 325 330 335 Leu Thr Ala Phe Leu Asp Arg Leu ArgGlu Ala Leu Arg Lys His Thr 340 345 350 Ser Leu Ser Pro Asp Ser Ile GluGly Gln Leu Ile Leu Lys Asp Lys 355 360 365 Phe Ile Thr Gln Ser Ala AlaAsp Ile Arg 
Lys Asn Phe Lys Ser Leu 370 375 380 Pro Lys Leu Ala Ala AlaLeu Glu His His His His His His 385 390 395 378 amino acids amino acidsingle linear peptide 122 Met Ala Ser Met Thr Gly Gly Gln Gln Met GlyArg Ile Leu Glu Arg 1 5 10 15 Ile Leu Glu Asn Trp Asp Gln Cys Asp ThrGln Thr Leu Arg Lys Lys 20 25 30 Arg Phe Ile Phe Phe Cys Ser Thr Ala TrpPro Gln Tyr Pro Leu Gln 35 40 45 Gly Arg Glu Thr Trp Leu Pro Glu Gly SerIle Asn Tyr Asn Ile Ile 50 55 60 Leu Gln Leu Asp Leu Phe Cys Arg Lys GluGly Lys Trp Ser Glu Val 65 70 75 80 Pro Tyr Val Gln Thr Phe Phe Ser LeuArg Asp Asn Ser Gln Leu Cys 85 90 95 Lys Lys Cys Gly Leu Cys Pro Thr GlySer Pro Gln Ser Pro Pro Pro 100 105 110 Tyr Pro Ser Val Pro Ser Pro ThrPro Ser Ser Thr Asn Lys Asp Pro 115 120 125 Pro Leu Thr Gln Thr Val GlnLys Glu Ile Asp Lys Gly Val Asn Asn 130 135 140 Glu Pro Lys Ser Ala AsnIle Pro Arg Leu Cys Pro Leu Gln Ala Val 145 150 155 160 Arg Gly Gly GluPhe Gly Pro Ala Arg Val Pro Val Pro Phe Ser Leu 165 170 175 Ser Asp LeuLys Gln Ile Lys Ile Asp Leu Gly Lys Phe Ser Asp Asn 180 185 190 Pro AspGly Tyr Ile Asp Val Leu Gln Gly Leu Gly Gln Ser Phe Asp 195 200 205 LeuThr Trp Arg Asp Ile Met Leu Leu Leu Asn Gln Thr Leu Thr Pro 210 215 220Asn Glu Arg Ser Ala Ala Val Thr Ala Ala Arg Glu Phe Gly Asp Leu 225 230235 240 Trp Tyr Leu Ser Gln Ala Asn Asn Arg Met Thr Thr Glu Glu Arg Thr245 250 255 Thr Pro Thr Gly Gln Gln Ala Val Pro Ser Val Asp Pro His TrpAsp 260 265 270 Thr Glu Ser Glu His Gly Asp Trp Cys His Lys His Leu LeuThr Cys 275 280 285 Val Leu Glu Gly Leu Arg Lys Thr Arg Lys Lys Pro MetAsn Tyr Ser 290 295 300 Met Met Ser Thr Ile Thr Gln Gly Lys Glu Glu AsnLeu Thr Ala Phe 305 310 315 320 Leu Asp Arg Leu Arg Glu Ala Leu Arg LysHis Thr Ser Leu Ser Pro 325 330 335 Asp Ser Ile Glu Gly Gln Leu Ile LeuLys Asp Lys Phe Ile Thr Gln 340 345 350 Ser Ala Ala Asp Ile Arg Lys AsnPhe Lys Ser Leu Pro Lys Leu Ala 355 360 365 Ala Ala Leu Glu His His HisHis His His 370 375 25 base pairs nucleic acid single linear cDNA 123CTTGGAGGGT GCATAACCAG 
GGAAT 25 20 base pairs nucleic acid single linearcDNA 124 TGTCCGCTGT GCTCCTGATC 20 25 base pairs nucleic acid singlelinear cDNA 125 CTATGTCCTT TTGGACTGTT TGGGT 25 764 base pairs nucleicacid single linear cDNA 126 TGTCCGCTGT GCTCCTGATC CAGCACAGGC GCCCATTGCCTCTCCCAATT GGGCTAAAGG 60 CTTGCCATTG TTCCTGCACA GCTAAGTGCC TGGGTTCATCCTAATCGAGC TGAACACTAG 120 TCACTGGGTT CCACGGTTCT CTTCCATGAC CCATGGCTTCTAATAGAGCT ATAACACTCA 180 CTGCATGGTC CAAGATTCCA TTCCTTGGAA TCCGTGAGACCAAGAACCCC AGGTCAGAGA 240 ACACAAGGCT TGCCACCATG TTGGAAGCAG CCCACCACCATTTTGGAAGC AGCCCGCCAC 300 TATCTTGGGA GCTCTGGGAG CAAGGACCCC AGGTAACAATTTGGTGACCA CGAAGGGACC 360 TGAATCCGCA ACCATGAAGG GATCTCCAAA GCAATTGGAAATGTTCCTCC CAAGGCAAAA 420 ATGCCCCTAA GATGTATTCT GGAGAATTGG GACCAATTTGACCCTCAGAC AGTAAGAAAA 480 AAATGACTTA TATTCTTCTG CAGTACCGCC CTGGCCACGATATCCTCTTC AAGGGGGAGA 540 AACCTGGCCT CCTGAGGGAA GTATAAATTA TAACACCATCTTACAGCTAG ACCTGTTTTG 600 TAGAAAAGGA GGCAAATGGA GTGAAGTGCC ATATTTACAAACTTTCTTTT CATTAAAAGA 660 CAACTCGCAA TTATGTTAAC AGTGTGATTT GTGTTCCTACACGGAAGCCC TCAGATTCTA 720 CTCCCCACCC CCGGCATCTC CCCTGAATCC CTCCCCAACTTATT 764 800 base pairs nucleic acid single linear cDNA 127 TGTCCGCTGTGCTCCTGATC CAGCACAGGC GCCCATTGCC TCTCCCAATT GGGCTAAAGG 60 CTTGCCATTGTTCCTGCACA GCTAAGTGCC TGGGTTCATC CTAATCGAGC TGAACACTAG 120 TCACTGGGTTCCACGGTTCT CTTCCATGAC CCATGGCTTC TAATAGAGCT ATAACACTCA 180 CTGCATGGTCCAAGATTCCA TTCCTTGGAA TCCGTGAGAC CAAGAACCCC AGGTCAGAGA 240 ACACAAGGCTTGCCACCATG TTGGAAGCAG CCCACCACCA TTTTGGAAGC GGCCCGCCAC 300 TATCTTGGGAGCTCTGGGAG CAAGGACCCC CAGGTAACAA TTTGGTGACC ACGAAGGGAC 360 CTGAATCCGCAACCATGAAG GGATCTCCAA AGCAATTGGA AATGTTCCTC CCAAGGCAAA 420 AATGCCCCTAAGATGTATTC TGGAGAATTG GGACCAATCT GACCCTCAGA CAGTAAGAAA 480 AAAAATGACTTATATTCTTC TGCAGTACCG CCTGGCCACG GATATCCTCT TCAAGGGGGA 540 GAAACCTGGCCTCCTGAGGG AAGTATAAAT TATAACACCA TCTTACAGCT AGACCTGTTT 600 TGTAGAAAAGGAGGCAAATG GAGTGAAGTG CCATATTTAC AAACTTTCTT TTCATTAAAA 660 GACAACTCGCAATTATGTAA ACAGTGTGAT TTGTGTCCTA CAGGAAGCCC TCAGATCTAC 720 CTCCCTACCCCGGCATCTCC 
CTGACTCCTT CCCCAACTAA TAAGGACCCA CTTCAGCCCA 780 AACAGTCCAAAAGGACATAG 800 438 base pairs nucleic acid single linear cDNA 128GACTTGAGCC AGTCYTCATA CCTGGACAYT CTTGTCCTTC GGTACATGGA TGATTTACTT 60TTAGCCACCC ATTCAGAAAC CTTGTGCCAT CAAGCCACCC AAGCACTCTT AAATTTCCTT 120GCTACCTGTG GCTACAAGGT TTCCAAACCA AAGGCTCAGC TCTGCTCACA GCAGGTTAAA 180TACTTAGGGC TAAAATTATC CAAAGGCACC AGAACCCTCA GTGAGGAACG TATCCAGCCT 240ATACTGGGTT ATCCTCATCC CAAAACCCTA AAGCAACTAA CAGCGTTCCT TGGCATAACA 300GGTTTCTGCC AAATATGGAT TCCCAGGTAC AGCAARRTAG CCAGACCATT AAATACACGA 360ATTAAGGAAA CTCAAAAAGC CARTACCCAT TTAGTAAGAT GGACAYCTGA AGCAGAAGTG 420GCTTTCCAGG CCCTAAAG 438 438 base pairs nucleic acid single linear cDNA129 GACTTGAGCC AGTCCTCATA CCTGGACACT CTTGTCCTTC GGTACATGGA TGATTTACTT 60TTAGCCACCC ATTCAGAAAC CTTGTGCCAT CAAGCCACCC AAGCACTCTT AAATTTCCTT 120GCTACCTGTG GCTACAAGGT TTCCAAACCA AAGGCTCAGC TCTGCTCACA GCAGGTTAAA 180TACTTAGGGC TAAAATTATC CAAAGGCACC AGAACCCTCA GTGAGGAACG TATCCAGCCT 240ATACTGGGTT ATCCTCATCC CAAAACCCTA AAGCAACTAA CAGCGTTCCT TGGCATAACA 300GGTTTCTGCC AAATATGGAT TCCCAGGTAC AGCAAGATAG CCAGACCATT AAATACACGA 360ATTAAGGAAA CTCAAAAAGC CAATACCCAT TTAGTAAGAT GGACACCTGA AGCAGAAGTG 420GCTTTCCAGG CCCTAAAG 438 438 base pairs nucleic acid single linear cDNA130 GACTTGAGCC AGTCCTCATA CCTGGACACT CTTGTCCTTC GGTACATGGA TGATTTACTT 60TTAGCCACCC ATTCAGAAAC CTTGTGCCAT CAAGCCACCC AAGCACTCTT AAATTTCCTT 120GCTACCTGTG GCTACAAGGT TTCCAAACCA AAGGCTCAGC TCTGCTCACA GCAGGTTAAA 180TACTTAGGGC TAAAATTATC CAAAGGCACC AGAACCCTCA GTGAGGAACG TATCCAGCCT 240ATACTGGGTT ATCCTCATCC CAAAACCCTA AAGCAACTAA CAGCGTTCCT TGGCATAACA 300GGTTTCTGCC AAATATGGAT TCCCAGGTAC AGCAAAGTAG CCAGACCATT AAATACACGA 360ATTAAGGAAA CTCAAAAAGC CAGTACCCAT TTAGTAAGAT GGACACCTGA AGCAGAAGTG 420GCTTTCCAGG CCCTAAAG 438 438 base pairs nucleic acid single linear cDNA131 GACTTGAGCC AGTCYTCATA CCTGGACAYT CTTGTCCTTC GGTACATGGA TGATTTACTT 60TTAGCCACCC ATTCAGAAAC CTTGTGCCAT CAAGCCACCC AAGCACTCTT AAATTTCCTT 120GCTACCTGTG GCTACAAGGT TTCCAAACCA AAGGCTCAGC TCTGCTCACA GCAGGTTAAA 
180TACTTAGGGC TAAAATTATC CAAAGGCACC AGAACCCTCA GTGAGGAACG TATCCAGCCT 240ATACTGGGTT ATCCTCATCC CAAAACCCTA AAGCAACTAA CAGCGTTCCT TGGCATAACA 300GGTTTCTGCC AAATATGGAT TCCCAGGTAC AGCAAAATAG CCAGACCATT AAATACACGA 360ATTAAGGAAA CTCAAAAAGC CAATACCCAT TTAGTAAGAT GGACATCTGA AGCAGAAGTG 420GCTTTCCAGG CCCTAAAG 438 146 amino acids amino acid single linear peptide132 Asp Leu Ser Gln Ser Ser Tyr Leu Asp Thr Leu Val Leu Arg Tyr Met 1 510 15 Asp Asp Leu Leu Leu Ala Thr His Ser Glu Thr Leu Cys His Gln Ala 2025 30 Thr Gln Ala Leu Leu Asn Phe Leu Ala Thr Cys Gly Tyr Lys Val Ser 3540 45 Lys Pro Lys Ala Gln Leu Cys Ser Gln Gln Val Lys Tyr Leu Gly Leu 5055 60 Lys Leu Ser Lys Gly Thr Arg Thr Leu Ser Glu Glu Arg Ile Gln Pro 6570 75 80 Ile Leu Gly Tyr Pro His Pro Lys Thr Leu Lys Gln Leu Thr Ala Phe85 90 95 Leu Gly Ile Thr Gly Phe Cys Gln Ile Trp Ile Pro Arg Tyr Ser Lys100 105 110 Ile Ala Arg Pro Leu Asn Thr Arg Ile Lys Glu Thr Gln Lys AlaAsn 115 120 125 Thr His Leu Val Arg Trp Thr Pro Glu Ala Glu Val Ala PheGln Ala 130 135 140 Leu Lys 145 146 amino acids amino acid single linearpeptide 133 Asp Leu Ser Gln Ser Ser Tyr Leu Asp Thr Leu Val Leu Arg TyrMet 1 5 10 15 Asp Asp Leu Leu Leu Ala Thr His Ser Glu Thr Leu Cys HisGln Ala 20 25 30 Thr Gln Ala Leu Leu Asn Phe Leu Ala Thr Cys Gly Tyr LysVal Ser 35 40 45 Lys Pro Lys Ala Gln Leu Cys Ser Gln Gln Val Lys Tyr LeuGly Leu 50 55 60 Lys Leu Ser Lys Gly Thr Arg Thr Leu Ser Glu Glu Arg IleGln Pro 65 70 75 80 Ile Leu Gly Tyr Pro His Pro Lys Thr Leu Lys Gln LeuThr Ala Phe 85 90 95 Leu Gly Ile Thr Gly Phe Cys Gln Ile Trp Ile Pro ArgTyr Ser Lys 100 105 110 Val Ala Arg Pro Leu Asn Thr Arg Ile Lys Glu ThrGln Lys Ala Ser 115 120 125 Thr His Leu Val Arg Trp Thr Pro Glu Ala GluVal Ala Phe Gln Ala 130 135 140 Leu Lys 145 146 amino acids amino acidsingle linear peptide 134 Asp Leu Ser Gln Ser Ser Tyr Leu Asp Xaa LeuVal Leu Arg Tyr Met 1 5 10 15 Asp Asp Leu Leu Leu Ala Thr His Ser GluThr Leu Cys His Gln Ala 20 25 30 Thr Gln Ala Leu Leu Asn Phe Leu Ala ThrCys Gly 
Tyr Lys Val Ser 35 40 45 Lys Pro Lys Ala Gln Leu Cys Ser Gln GlnVal Lys Tyr Leu Gly Leu 50 55 60 Lys Leu Ser Lys Gly Thr Arg Thr Leu SerGlu Glu Arg Ile Gln Pro 65 70 75 80 Ile Leu Gly Tyr Pro His Pro Lys ThrLeu Lys Gln Leu Thr Ala Phe 85 90 95 Leu Gly Ile Thr Gly Phe Cys Gln IleTrp Ile Pro Arg Tyr Ser Lys 100 105 110 Ile Ala Arg Pro Leu Asn Thr ArgIle Lys Glu Thr Gln Lys Ala Asn 115 120 125 Thr His Leu Val Arg Trp ThrSer Glu Ala Glu Val Ala Phe Gln Ala 130 135 140 Leu Lys 145 429 basepairs nucleic acid single linear cDNA 135 GACTTGAGCC AGTCCTCATACCTGGACATT CTTGTTCTTC AGTATGGGGA TGAYTTRATT 60 ATAGCCACCC ATTCAGAAACCTTGTGGCAY CAAGCCACCC AAGYGCTCTT AAATTTCCTY 120 GCTACCTGTG GCTCCAAACAAAARGCTCAY CTCTGCTCAC AYCAGGTTAA ATACTTAGGG 180 CTAAAATTAT CCAAAGTCRCCAGGGCCCTC AGAGAGGAAC GTATCCAGCG TATACTGGRT 240 TATCCYCATC CCAYAACCRTAAAGCAACTA AGARGGTTCC TTGGCATAYC AGCCTTCTGC 300 CGAATATGGA TTCCCRGATACAGYGAAATA GCCAGGCCAT TATGTACATT ARYTAAGGAA 360 ACTCAGAAAG CCAATACCCATATAGTAAGA TGGACACCTG AAACAGAAGT GGCTTTCCAG 420 GCCCTAAAG 429 429 basepairs nucleic acid single linear cDNA 136 GACTTGAGCC AGTCCTCATACCTGGACATT CTTGTTCTTC AGTATGGGGA TGACTTAATT 60 ATAGCCACCC ATTCAGAAACCTTGTGGCAT CAAGCCACCC AAGCGCTCTT AAATTTCCTT 120 GCTACCTGTG GCTCCAAACAAAAGGCTCAC CTCTGCTCAC ACCAGGTTAA ATACTTAGGG 180 CTAAAATTAT CCAAAGTCACCAGGGCCCTC AGAGAGGAAC GTATCCAGCG TATACTGGCT 240 TATCCTCATC CCATAACCCTAAAGCAACTA AGAGGGTTCC TTGGCATATC AGCCTTCTGC 300 CGAATATGGA TTCCCGGATACAGTGAAATA GCCAGGCCAT TATGTACATT AATTAAGGAA 360 ACTCAGAAAG CCAATACCCATATAGTAAGA TGGACACCTG AAACAGAAGT GGCTTTCCAG 420 GCCCTAAAG 429 429 basepairs nucleic acid single linear cDNA 137 GACTTGAGCC AGTCCTCATACCTGGACATT CTTGTTCTTC AGTATGGGGA TGACTTAATT 60 ATAGCCACCC ATTCAGAAACCTTGTGGCAT CAAGCCACCC AAGTGCTCTT AAATTTCCTC 120 GCTACCTGTG GCTCCAAACAAAAGGCTCAC CTCTGCTCAC AGCAGGTTAA ATACTTAGGG 180 CTAAAATTAT CCAAAGTCGCCAGGGCCCTC AGAGAGGAAC GTATCCAGCG TATACTGGAT 240 TATCCTCATC CCAAAACCATAAAGCAACTA AGAGGGTTCC TTGGCATAAC AGCCTTCTGC 300 CGAATATGGA TTCCCCGATACAGTGAAATA 
GCCAGGCCAT TATGTACATT AGTTAAGGAA 360 ACTCAGAAAG CCAATACCCATATAGTAAGA TGGACACCTG AGACAGAAGT GGCTTTCCAG 420 GCCCTAAAG 429 429 basepairs nucleic acid single linear cDNA 138 GACTTGAGCC AGTCCTCATACCTGGACATT CTTGTTCCTC AGTATGGGGA TGATTTAATT 60 ATAGCCACCC ATTCAGAAACCTTGTGGCAC CAAGCCACCC AAGCGCTCTT AAATTTCCTC 120 GCTACCTGTG GCTCCAAACAAAAGGCTCAG CTCTGCTCAC AGCAGGTTAA ATACTTAGGG 180 CTAAAATTAT CCAAAGTCACCAGGGCCCTC AGAGAGGAAC GTATCCAGCG TATACTGGCT 240 TATCCCCATC CCAAAACCCTAAAGCAACTA AGARGGTTCC TTGGCATAAC AGCCTTCTGC 300 CGAATATGGA TTCCCAGATACAGCGAAATA GCCAGGCCAT TATGTACATT ATCTAAGGAA 360 ACTCAGAAAG CCAATACCCATATAGTAAGA TGGACACCTG AAACAGAAGT GGCTTTCCAG 420 GCCCTAAAG 429 143 aminoacids amino acid single linear peptide 139 Asp Leu Ser Gln Ser Ser TyrLeu Asp Ile Leu Val Leu Gln Tyr Gly 1 5 10 15 Asp Asp Leu Ile Ile AlaThr His Ser Glu Thr Leu Trp His Gln Ala 20 25 30 Thr Gln Ala Leu Leu AsnPhe Leu Ala Thr Cys Gly Ser Lys Gln Lys 35 40 45 Ala His Leu Cys Ser HisGln Val Lys Tyr Leu Gly Leu Lys Leu Ser 50 55 60 Lys Val Thr Arg Ala LeuArg Glu Glu Arg Ile Gln Arg Ile Leu Ala 65 70 75 80 Tyr Pro His Pro IleThr Leu Lys Gln Leu Arg Gly Phe Leu Gly Ile 85 90 95 Ser Ala Phe Cys ArgIle Trp Ile Pro Gly Tyr Ser Glu Ile Ala Arg 100 105 110 Pro Leu Cys ThrLeu Ile Lys Glu Thr Gln Lys Ala Asn Thr His Ile 115 120 125 Val Arg TrpThr Pro Glu Thr Glu Val Ala Phe Gln Ala Leu Lys 130 135 140 143 aminoacids amino acid single linear peptide 140 Asp Leu Ser Gln Ser Ser TyrLeu Asp Ile Leu Val Leu Gln Tyr Arg 1 5 10 15 Asp Asp Leu Ile Ile AlaThr His Ser Glu Thr Leu Trp His Gln Ala 20 25 30 Thr Gln Val Leu Leu AsnPhe Leu Ala Thr Cys Gly Ser Lys Gln Arg 35 40 45 Ala Gln Leu Cys Ser GlnGln Val Lys Tyr Leu Gly Leu Lys Leu Ser 50 55 60 Lys Val Ala Arg Ala LeuArg Glu Glu Arg Ile Gln Arg Ile Leu Asp 65 70 75 80 Tyr Pro His Pro LysThr Ile Lys Gln Leu Arg Gly Phe Leu Gly Ile 85 90 95 Thr Ala Phe Cys ArgIle Trp Ile Pro Arg Tyr Ser Glu Ile Ala Arg 100 105 110 Pro Leu Cys ThrLeu Val Lys Glu Thr Gln Lys Ala Asn Thr His 
Ile 115 120 125 Val Arg TrpThr Pro Glu Thr Glu Val Ala Phe Gln Ala Leu Lys 130 135 140 143 aminoacids amino acid single linear peptide 141 Asp Leu Ser Gln Ser Ser TyrLeu Asp Ile Leu Val Pro Gln Tyr Gly 1 5 10 15 Asp Asp Leu Ile Ile AlaThr His Ser Glu Thr Leu Trp His Gln Ala 20 25 30 Thr Gln Ala Leu Leu AsnPhe Leu Ala Thr Cys Gly Ser Lys Gln Lys 35 40 45 Ala Gln Leu Cys Ser GlnGln Val Lys Tyr Leu Gly Leu Lys Leu Ser 50 55 60 Lys Val Thr Arg Ala LeuArg Glu Glu Arg Ile Gln Arg Ile Leu Ala 65 70 75 80 Tyr Pro His Pro LysThr Leu Lys Gln Leu Arg Xaa Phe Leu Gly Ile 85 90 95 Thr Ala Phe Cys ArgIle Trp Ile Pro Arg Tyr Ser Glu Ile Ala Arg 100 105 110 Pro Leu Cys ThrLeu Ser Lys Glu Thr Gln Lys Ala Asn Thr His Ile 115 120 125 Val Arg TrpThr Pro Glu Thr Glu Val Ala Phe Gln Ala Leu Lys 130 135 140 25 basepairs nucleic acid single linear cDNA 142 GGCCAGGCAT CAGCCCAAGA CTTGA 2522 base pairs nucleic acid single linear cDNA 143 TGCAAGCTCA TCCCTSRGACCT 22 23 base pairs nucleic acid single linear cDNA 144 GACTTGAGCCAGTCCTCATA CCT 23 22 base pairs nucleic acid single linear cDNA 145CTTTAGGGCC TGGAAAGCCA CT 22 8 amino acids amino acid single linearpeptide 146 Phe Cys Ile Pro Val Arg Pro Asp 1 5 8 amino acids amino acidsingle linear peptide 147 Arg Pro Asp Ser Gln Phe Leu Phe 1 5 8 aminoacids amino acid single linear peptide 148 Thr Val Leu Pro Gln Gly PheArg 1 5 8 amino acids amino acid single linear peptide 149 Leu Phe GlyGln Ala Leu Ala Gln 1 5 8 amino acids amino acid single linear peptide150 Asp Ala Phe Phe Cys Ile Pro Val 1 5 8 amino acids amino acid singlelinear peptide 151 Ala Phe Phe Cys Ile Pro Val Arg 1 5 8 amino acidsamino acid single linear peptide 152 Phe Phe Cys Ile Pro Val Arg Pro 1 58 amino acids amino acid single linear peptide 153 Cys Ile Pro Val ArgPro Asp Ser 1 5 8 amino acids amino acid single linear peptide 154 IlePro Val Arg Pro Asp Ser Gln 1 5 8 amino acids amino acid single linearpeptide 155 Pro Val Arg Pro Asp Ser Gln Phe 1 5 8 amino acids amino acidsingle 
linear peptide 156 Val Arg Pro Asp Ser Gln Phe Leu 1 5
8 amino acids amino acid single linear peptide 157 Pro Asp Ser Gln Phe Leu Phe Ala 1 5
8 amino acids amino acid single linear peptide 158 Asp Ser Gln Phe Leu Phe Ala Phe 1 5
8 amino acids amino acid single linear peptide 159 Ser Gln Phe Leu Phe Ala Phe Glu 1 5
8 amino acids amino acid single linear peptide 160 Gln Phe Leu Phe Ala Phe Glu Asp 1 5
8 amino acids amino acid single linear peptide 161 Phe Leu Phe Ala Phe Glu Asp Pro 1 5
8 amino acids amino acid single linear peptide 162 Ala Phe Glu Asp Pro Leu Asn Pro 1 5
8 amino acids amino acid single linear peptide 163 Phe Glu Asp Pro Leu Asn Pro Thr 1 5
8 amino acids amino acid single linear peptide 164 Glu Asp Pro Leu Asn Pro Thr Ser 1 5
8 amino acids amino acid single linear peptide 165 Asp Pro Leu Asn Pro Thr Ser Gln 1 5
8 amino acids amino acid single linear peptide 166 Pro Leu Asn Pro Thr Ser Gln Leu 1 5
8 amino acids amino acid single linear peptide 167 Leu Asn Pro Thr Ser Gln Leu Thr 1 5
8 amino acids amino acid single linear peptide 168 Asn Pro Thr Ser Gln Leu Thr Trp 1 5
8 amino acids amino acid single linear peptide 169 Pro Thr Ser Gln Leu Thr Trp Thr 1 5
8 amino acids amino acid single linear peptide 170 Thr Ser Gln Leu Thr Trp Thr Val 1 5
8 amino acids amino acid single linear peptide 171 Ser Gln Leu Thr Trp Thr Val Leu 1 5
8 amino acids amino acid single linear peptide 172 Gln Leu Thr Trp Thr Val Leu Pro 1 5
8 amino acids amino acid single linear peptide 173 Leu Thr Trp Thr Val Leu Pro Gln 1 5
8 amino acids amino acid single linear peptide 174 Thr Trp Thr Val Leu Pro Gln Gly 1 5
8 amino acids amino acid single linear peptide 175 Trp Thr Val Leu Pro Gln Gly Phe 1 5
8 amino acids amino acid single linear peptide 176 Val Leu Pro Gln Gly Phe Arg Asp 1 5
8 amino acids amino acid single linear peptide 177 Leu Pro Gln Gly Phe Arg Asp Ser 1 5
8 amino acids amino acid single linear peptide 178 Pro Gln Gly Phe Arg Asp Ser Pro 1 5
8 amino acids amino acid single linear peptide 179 Gln Gly
Phe Arg Asp Ser Pro His 1 5
8 amino acids amino acid single linear peptide 180 Gly Phe Arg Asp Ser Pro His Leu 1 5
8 amino acids amino acid single linear peptide 181 Phe Arg Asp Ser Pro His Leu Phe 1 5
8 amino acids amino acid single linear peptide 182 Arg Asp Ser Pro His Leu Phe Gln 1 5
8 amino acids amino acid single linear peptide 183 Asp Ser Pro His Leu Phe Gly Gln 1 5
8 amino acids amino acid single linear peptide 184 Ser Pro His Leu Phe Gly Gln Ala 1 5
8 amino acids amino acid single linear peptide 185 Pro His Leu Phe Gly Gln Ala Leu 1 5
8 amino acids amino acid single linear peptide 186 His Leu Phe Gly Gln Ala Leu Ala 1 5
16 base pairs nucleic acid single linear DNA (genomic) 187 TGGAAAGTGT TACCCC 16
15 base pairs nucleic acid single linear DNA (genomic) 188 AGTGTTACCC CAAGG 15
18 base pairs nucleic acid single linear DNA (genomic) 189 ATGTACCTAC TGTACGAC 18
129 base pairs nucleic acid single linear DNA (genomic) 190 TGGAAAGTAC TACCCCAAGG GTTTAAAAAT AGTCCCACCC TGTTCGAAAT GCAGCTGGCC 60 CATATCCTGC AGCCCATTCG GCAAGCTTTC CCCCAATGCA CTATTCTTCA GTACATGGAT 120 GACATTCTC 129
129 base pairs nucleic acid single linear DNA (genomic) 191 TACAATGTGC TTCCACAGGG ATGGAAAGGA TCACCAGCAA TATTCCAAAG TAGCATGACA 60 AAAATCTTAG AGCCTTTTAA AAAACAAAAT CCAGACATAG TTATCTATCA ATACATGGAT 120 GATTTGTAT 129
129 base pairs nucleic acid single linear DNA (genomic) 192 TGGACCAGAC TCCCACAGGG TTTCAAAAAC AGTCCCACCC TGTTTGATGA GGCACTGCAC 60 AGAGACCTAG CAGACTTCCG GATCCAGCAC CCAGACTTGA TCCTGCTACA GTACGTGGAT 120 GACTTACTG 129
129 base pairs nucleic acid single linear DNA (genomic) 193 TGGAAGGTTT TACCACAAGG TATGGCCAAC AGTCCTACCT TATGTCAAAA ATATGTGGCC 60 ACAGCCATAC ATAAGGTTAG ACATGCCTGG AAACAAATGT ATATTATACA TTACATGGAT 120 GACATCCTA 129
123 base pairs nucleic acid single linear DNA (genomic) 194 TGGATGGTCT TGCCCCAAGG GTTTAGGGAT AGCCCTCATC TGTTTGGTCA GGCCCTAGCC 60 AAAGATCTAG GCCACTTCTC AAGTCCAGGC ACTCTGGTCC TTCAATATGT GGATGATTTA 120 CTT 123
85 base pairs nucleic acid single linear DNA (genomic) 195 GTTCAGGGAT AGCCCCCATC TATTTGGCCA
GGCATTAGCC CAAGACTTGA GCCAATTCTC 60ATACCTGGAC ACTCTTGTCC TTCAG 85 23 base pairs nucleic acid single linearDNA (genomic) 196 CATCTNTTTG GNCAGGCANT AGC 23 24 base pairs nucleicacid single linear DNA (genomic) 197 CTTGAGCCAG TTCTCATACC TGGA 24 683amino acids amino acid single linear peptide 198 Ile Met Pro Glu Ser ProThr Pro Leu Leu Gly Arg Asp Ile Leu Ala 1 5 10 15 Lys Ala Gly Ala IleIle His Leu Asn Ile Gly Lys Gly Ile Pro Ile 20 25 30 Cys Cys Pro Leu LeuGlu Glu Gly Ile Asn Pro Glu Val Trp Ala Ile 35 40 45 Glu Gly Gln Tyr GlyGln Ala Lys Asn Ala Arg Pro Val Gln Val Lys 50 55 60 Leu Lys Asp Ser AlaSer Phe Pro Tyr Gln Arg Lys Tyr Pro Leu Arg 65 70 75 80 Pro Glu Ala LeuGln Gly Xaa Gln Lys Ile Val Lys Asp Leu Lys Ala 85 90 95 Gln Gly Leu ValLys Pro Cys Ser Ser Pro Cys Asn Thr Pro Ile Leu 100 105 110 Gly Val ArgLys Pro Asn Gly Gln Trp Arg Leu Val Gln Asp Leu Arg 115 120 125 Ile IleAsn Glu Ala Val Phe Pro Leu Tyr Pro Ala Val Ser Ser Pro 130 135 140 TyrThr Leu Leu Ser Leu Ile Pro Glu Glu Ala Glu Trp Phe Thr Val 145 150 155160 Leu Asp Leu Lys Asp Ala Phe Phe Cys Ile Pro Val Arg Pro Asp Ser 165170 175 Gln Phe Leu Phe Ala Phe Glu Asp Pro Leu Asn Pro Thr Ser Gln Leu180 185 190 Thr Trp Thr Val Leu Pro Gln Gly Phe Arg Asp Ser Pro His LeuPhe 195 200 205 Gly Gln Ala Leu Ala Gln Asp Leu Ser Gln Pro Ser Tyr LeuAsp Ile 210 215 220 Leu Val Leu Gln Tyr Val Asp Asp Leu Leu Leu Val AlaArg Ser Glu 225 230 235 240 Thr Leu Cys His Gln Ala Thr Gln Glu Leu LeuIle Phe Leu Thr Thr 245 250 255 Cys Gly Tyr Lys Val Ser Lys Pro Lys AlaArg Leu Cys Ser Gln Glu 260 265 270 Ile Arg Tyr Leu Gly Leu Lys Leu SerLys Gly Thr Arg Ala Leu Ser 275 280 285 Glu Glu Arg Ile Gln Pro Ile LeuAla Tyr Pro His Pro Lys Thr Leu 290 295 300 Lys Gln Leu Arg Gly Phe LeuGly Ile Thr Gly Phe Cys Arg Lys Gln 305 310 315 320 Ile Pro Arg Tyr ThrPro Ile Ala Arg Pro Leu Tyr Thr Leu Ile Arg 325 330 335 Glu Thr Gln LysAla Asn Thr Tyr Leu Val Arg Trp Thr Pro Thr Glu 340 345 350 Val Ala PheGln Ala Leu Lys Lys Ala Leu Thr Gln Ala Pro Val 
Phe 355 360 365 Ser LeuPro Thr Gly Gln Asp Phe Ser Leu Tyr Ala Thr Glu Lys Thr 370 375 380 GlyIle Ala Leu Gly Val Leu Thr Gln Val Ser Gly Met Ser Leu Gln 385 390 395400 Pro Val Val Tyr Leu Ser Lys Glu Ile Asp Val Val Ala Lys Gly Trp 405410 415 Pro His Cys Leu Trp Val Met Ala Ala Val Ala Val Leu Val Ser Glu420 425 430 Ala Val Lys Ile Ile Gln Gly Arg Asp Leu Thr Val Trp Thr SerHis 435 440 445 Asp Val Asn Gly Ile Leu Thr Ala Lys Gly Asp Leu Trp LeuSer Asp 450 455 460 Asn His Leu Leu Asn Tyr Gln Ala Leu Leu Leu Glu GluPro Val Leu 465 470 475 480 Arg Leu Arg Thr Cys Ala Thr Leu Gln Pro AlaThr Phe Leu Pro Asp 485 490 495 Asn Glu Glu Gln Ile Glu His Asn Cys GlnGln Val Ile Ala Gln Thr 500 505 510 Tyr Ala Ala Arg Gly Asp Leu Leu GluVal Pro Leu Thr Asp Pro Asp 515 520 525 Leu Asn Leu Tyr Thr Asp Gly SerSer Leu Ala Glu Lys Gly Leu Arg 530 535 540 Lys Ala Gly Tyr Ala Val IleSer Asp Asn Gly Ile Leu Glu Ser Asn 545 550 555 560 Arg Leu Thr Pro GlyThr Ser Ala His Leu Ala Glu Leu Ile Ala Leu 565 570 575 Thr Trp Ala LeuGlu Leu Gly Glu Gly Lys Arg Val Asn Ile Tyr Ser 580 585 590 Asp Ser LysTyr Ala Tyr Leu Val Leu His Ala His Ala Ala Ile Trp 595 600 605 Arg GluArg Glu Phe Leu Thr Ser Glu Gly Thr Pro Ile Asn His Gln 610 615 620 GluAla Ile Arg Arg Leu Leu Leu Ala Val Gln Lys Pro Lys Glu Val 625 630 635640 Ala Val Leu His Cys Gln Gly His Gln Glu Glu Glu Glu Arg Glu Ile 645650 655 Glu Gly Asn Arg Gln Ala Asp Ile Glu Ala Lys Lys Ala Ala Arg Gln660 665 670 Asp Ser Pro Leu Glu Met Leu Ile Glu Gly Pro 675 680 143amino acids amino acid single linear peptide 199 Asp Leu Ser Gln Ser SerTyr Leu Asp Ile Leu Val Leu Gln Tyr Gly 1 5 10 15 Asp Asp Leu Ile IleAla Thr His Ser Glu Thr Leu Trp His Gln Ala 20 25 30 Thr Gln Ala Leu LeuAsn Phe Leu Ala Thr Cys Gly Ser Lys Gln Lys 35 40 45 Ala Gln Leu Cys SerGln Gln Val Lys Tyr Leu Gly Leu Lys Leu Ser 50 55 60 Lys Val Thr Arg AlaLeu Arg Glu Glu Arg Ile Gln Arg Ile Leu Ala 65 70 75 80 Tyr Pro His ProLys Thr Leu Lys Gln Leu Arg Gly Phe Leu Gly Ile 85 90 95 Thr 
Ala Phe CysArg Ile Trp Ile Pro Arg Tyr Ser Glu Ile Ala Arg 100 105 110 Pro Leu CysThr Leu Xaa Lys Glu Thr Gln Lys Ala Asn Thr His Ile 115 120 125 Val ArgTrp Thr Pro Glu Thr Glu Val Ala Phe Gln Ala Leu Lys 130 135 140 683amino acids amino acid single linear peptide 200 Ile Met Pro Glu Ser ProThr Pro Leu Leu Gly Arg Asp Ile Leu Ala 1 5 10 15 Lys Ala Gly Ala IleIle His Leu Asn Ile Gly Lys Gly Ile Pro Ile 20 25 30 Cys Cys Pro Leu LeuGlu Glu Gly Ile Asn Pro Glu Val Trp Ala Ile 35 40 45 Glu Gly Gln Tyr GlyGln Ala Lys Asn Ala Arg Pro Val Gln Val Lys 50 55 60 Leu Lys Asp Ser AlaSer Phe Pro Tyr Gln Arg Lys Tyr Pro Leu Arg 65 70 75 80 Pro Glu Ala LeuGln Gly Xaa Gln Lys Ile Val Lys Asp Leu Lys Ala 85 90 95 Gln Gly Leu ValLys Pro Cys Ser Ser Pro Cys Asn Thr Pro Ile Leu 100 105 110 Gly Val ArgLys Pro Asn Gly Gln Trp Arg Leu Val Gln Asp Leu Arg 115 120 125 Ile IleAsn Glu Ala Val Phe Pro Leu Tyr Pro Ala Val Ser Ser Pro 130 135 140 TyrThr Leu Leu Ser Leu Ile Pro Glu Glu Ala Glu Trp Phe Thr Val 145 150 155160 Leu Asp Leu Lys Asp Ala Phe Phe Cys Ile Pro Val Arg Pro Asp Ser 165170 175 Gln Phe Leu Phe Ala Phe Glu Asp Pro Leu Asn Pro Thr Ser Gln Leu180 185 190 Thr Trp Thr Val Leu Pro Gln Gly Phe Arg Asp Ser Pro His LeuPhe 195 200 205 Gly Gln Ala Leu Ala Gln Asp Leu Ser Gln Pro Ser Tyr LeuAsp Ile 210 215 220 Leu Val Leu Gln Tyr Val Asp Asp Leu Leu Leu Val AlaArg Ser Glu 225 230 235 240 Thr Leu Cys His Gln Ala Thr Gln Glu Leu LeuIle Phe Leu Thr Thr 245 250 255 Cys Gly Tyr Lys Val Ser Lys Pro Lys AlaArg Leu Cys Ser Gln Glu 260 265 270 Ile Arg Tyr Leu Gly Leu Lys Leu SerLys Gly Thr Arg Ala Leu Ser 275 280 285 Glu Glu Arg Ile Gln Pro Ile LeuAla Tyr Pro His Pro Lys Thr Leu 290 295 300 Lys Gln Leu Arg Gly Phe LeuGly Ile Thr Gly Phe Cys Arg Lys Gln 305 310 315 320 Ile Pro Arg Tyr ThrPro Ile Ala Arg Pro Leu Tyr Thr Leu Ile Arg 325 330 335 Glu Thr Gln LysAla Asn Thr Tyr Leu Val Arg Trp Thr Pro Thr Glu 340 345 350 Val Ala PheGln Ala Leu Lys Lys Ala Leu Thr Gln Ala Pro Val Phe 355 360 365 Ser 
LeuPro Thr Gly Gln Asp Phe Ser Leu Tyr Ala Thr Glu Lys Thr 370 375 380 GlyIle Ala Leu Gly Val Leu Thr Gln Val Ser Gly Met Ser Leu Gln 385 390 395400 Pro Val Val Tyr Leu Ser Lys Glu Ile Asp Val Val Ala Lys Gly Trp 405410 415 Pro His Cys Leu Trp Val Met Ala Ala Val Ala Val Leu Val Ser Glu420 425 430 Ala Val Lys Ile Ile Gln Gly Arg Asp Leu Thr Val Trp Thr SerHis 435 440 445 Asp Val Asn Gly Ile Leu Thr Ala Lys Gly Asp Leu Trp LeuSer Asp 450 455 460 Asn His Leu Leu Asn Tyr Gln Ala Leu Leu Leu Glu GluPro Val Leu 465 470 475 480 Arg Leu Arg Thr Cys Ala Thr Leu Gln Pro AlaThr Phe Leu Pro Asp 485 490 495 Asn Glu Glu Gln Ile Glu His Asn Cys GlnGln Val Ile Ala Gln Thr 500 505 510 Tyr Ala Ala Arg Gly Asp Leu Leu GluVal Pro Leu Thr Asp Pro Asp 515 520 525 Leu Asn Leu Tyr Thr Asp Gly SerSer Leu Ala Glu Lys Gly Leu Arg 530 535 540 Lys Ala Gly Tyr Ala Val IleSer Asp Asn Gly Ile Leu Glu Ser Asn 545 550 555 560 Arg Leu Thr Pro GlyThr Ser Ala His Leu Ala Glu Leu Ile Ala Leu 565 570 575 Thr Trp Ala LeuGlu Leu Gly Glu Gly Lys Arg Val Asn Ile Tyr Ser 580 585 590 Asp Ser LysTyr Ala Tyr Leu Val Leu His Ala His Ala Ala Ile Trp 595 600 605 Arg GluArg Glu Phe Leu Thr Ser Glu Gly Thr Pro Ile Asn His Gln 610 615 620 GluAla Ile Arg Arg Leu Leu Leu Ala Val Gln Lys Pro Lys Glu Val 625 630 635640 Ala Val Leu His Cys Gln Gly His Gln Glu Glu Glu Glu Arg Glu Ile 645650 655 Glu Gly Asn Arg Gln Ala Asp Ile Glu Ala Lys Lys Ala Ala Arg Gln660 665 670 Asp Ser Pro Leu Glu Met Leu Ile Glu Gly Pro 675 680 438 basepairs nucleic acid single linear DNA (genomic) 201 GACTTGAGCC AGTCYTCATACCTGGACAYT CTTGTYCYTC RGTAYRKGGA TGAYTTAMTT 60 WTAGCCACCC ATTCAGAAACCTTGTGSCAY CAAGCCACCC AAGYRCTCTT AAATTTCCTY 120 GCTACCTGTG GCTACAAGGTTTCCAAACMA ARGGCTCASC TCTGCTCACA SCAGGTTAAA 180 TACTTAGGGC TAAAATTATCCAAAGKCRCC AGRRCCCTCA GWGAGGAACG TATCCAGCST 240 ATACTGGVTT ATCCYCATCCCAWAACCMTA AAGCAACTAA SARSGTTCCT TGGCATAWCA 300 GSYTTCTGCC RAATATGGATTCCCVGRTAC AGYRARRTAG CCAGRCCATT AWRTACAYKA 360 DYTAAGGAAA CTCARAAAGCCARTACCCAT 
WTAGTAAGAT GGACAYCTGA RRCAGAAGTG 420 GCTTTCCAGG CCCTAAAG 438146 amino acids amino acid single linear peptide 202 Asp Leu Ser Gln SerSer Tyr Leu Asp Thr Leu Val Leu Arg Tyr Met 1 5 10 15 Asp Asp Leu LeuLeu Ala Thr His Ser Glu Thr Leu Cys His Gln Ala 20 25 30 Thr Gln Ala LeuLeu Asn Phe Leu Ala Thr Cys Gly Tyr Lys Val Ser 35 40 45 Lys Pro Lys AlaGln Leu Cys Ser Gln Gln Val Lys Tyr Leu Gly Leu 50 55 60 Lys Leu Ser LysGly Thr Arg Thr Leu Ser Glu Glu Arg Ile Gln Pro 65 70 75 80 Ile Leu GlyTyr Pro His Pro Lys Thr Leu Lys Gln Leu Thr Ala Phe 85 90 95 Leu Gly IleThr Gly Phe Cys Gln Ile Trp Ile Pro Arg Tyr Ser Lys 100 105 110 Ile AlaArg Pro Leu Asn Thr Arg Ile Lys Glu Thr Gln Lys Ala Asn 115 120 125 ThrHis Leu Val Arg Trp Thr Pro Glu Ala Glu Val Ala Phe Gln Ala 130 135 140Leu Lys 145 143 amino acids amino acid single linear peptide 203 Asp LeuSer Gln Ser Ser Tyr Leu Asp Ile Leu Val Leu Gln Tyr Gly 1 5 10 15 AspAsp Leu Ile Ile Ala Thr His Ser Glu Thr Leu Trp His Gln Ala 20 25 30 ThrGln Ala Leu Leu Asn Phe Leu Ala Thr Cys Gly Ser Lys Gln Lys 35 40 45 AlaGln Leu Cys Ser Gln Gln Val Lys Tyr Leu Gly Leu Lys Leu Ser 50 55 60 LysVal Thr Arg Ala Leu Arg Glu Glu Arg Ile Gln Arg Ile Leu Ala 65 70 75 80Tyr Pro His Pro Lys Thr Leu Lys Gln Leu Arg Gly Phe Leu Gly Ile 85 90 95Thr Ala Phe Cys Arg Ile Trp Ile Pro Arg Tyr Ser Glu Ile Ala Arg 100 105110 Pro Leu Cys Thr Leu Xaa Lys Glu Thr Gln Lys Ala Asn Thr His Ile 115120 125 Val Arg Trp Thr Pro Glu Thr Glu Val Ala Phe Gln Ala Leu Lys 130135 140 146 amino acids amino acid single linear peptide 204 Asp Leu SerGln Ser Ser Tyr Leu Asp Xaa Leu Val Leu Xaa Tyr Xaa 1 5 10 15 Asp AspLeu Xaa Xaa Ala Thr His Ser Glu Thr Leu Xaa His Gln Ala 20 25 30 Thr GlnAla Leu Leu Asn Phe Leu Ala Thr Cys Gly Xaa Lys Xaa Xaa 35 40 45 Xaa XaaLys Ala Gln Leu Cys Ser Gln Gln Val Lys Tyr Leu Gly Leu 50 55 60 Lys LeuSer Lys Xaa Thr Arg Xaa Leu Xaa Glu Glu Arg Ile Gln Xaa 65 70 75 80 IleLeu Xaa Tyr Pro His Pro Lys Thr Leu Lys Gln Leu Xaa Xaa Phe 85 90 95 LeuGly Ile Thr 
Xaa Phe Cys Xaa Ile Trp Ile Pro Arg Tyr Ser Xaa 100 105 110Ile Ala Arg Pro Leu Xaa Thr Xaa Xaa Lys Glu Thr Gln Lys Ala Asn 115 120125 Thr His Xaa Val Arg Trp Thr Pro Glu Xaa Glu Val Ala Phe Gln Ala 130135 140 Leu Lys 145 1597 base pairs nucleic acid single linear DNA(genomic) 205 ATTATGCCTG AAAGCCCCAC TCCCTTGTTA GGGAGAGACA TTTTAGCAAAAGCAGGGGCC 60 ATTATACACC TGAACATAGG AAAAGGAATA CCCATTTGCT GTCCCCTGCTTGAGGAAGGA 120 ATTAATCCTG AAGTCTGGGC AATAGAAGGA CAATATGGAC AAGCAAAGAATGCCCGTCCT 180 GTTCAAGTTA AACTAAAGGA TTCTGCCTCC TTTCCCTACC AAAGGAAGTACCCTCTTAGA 240 CCCGAGGCCC TACAAGGANC TCAAAAGATT GTTAAGGACC TAAAAGCCCAAGGCCTAGTA 300 AAACCATGCA GTAGCCCCTG CAATACTCCA ATTTTAGGAG TAAGGAAACCCAACGGACAG 360 TGGAGGTTAG TGCAAGAACT CAGGATTATC AATGAGGCTG TTGTTCCTCTATACCCAGCT 420 GTACCTAACC CTTATACAGT GCTTTCCCAA ATACCAGAGG AAGCAGAGTGGTTTACAGTC 480 CTGGACCTTA AGGATGCCTT TTTCTGCATC CCTGTACGTC CTGACTCTCAATTCTTGTTT 540 GCCTTTGAAG ATCCTTTGAA CCCAACGTCT CAACTCACCT GGACTGTTTTACCCCAAGGG 600 TTCAGGGATA GCCCCCATCT ATTTGGCCAG GCATTAGCCC AAGACTTGAGTCAATTCTCA 660 TACCTGGACA CTCTTGTCCT TCAGTACATG GATGATTTAC TTTTAGTCGCCCGTTCAGAA 720 ACCTTGTGCC ATCAAGCCAC CCAAGAACTC TTAACTTTCC TCACTACCTGTGGCTACAAG 780 GTTTCCAAAC CAAAGGCTCG GCTCTGCTCA CAGGAGATTA GATACTNAGGGCTAAAATTA 840 TCCAAAGGCA CCAGGGCCCT CAGTGAGGAA CGTATCCAGC CTATACTGGCTTATCCTCAT 900 CCCAAAACCC TAAAGCAACT AAGAGGGTTC CTTGGCATAA CAGGTTTCTGCCGAAAACAG 960 ATTCCCAGGT ACASCCCAAT AGCCAGACCA TTATATACAC TAATTANGGAAACTCAGAAA 1020 GCCAATACCT ATTTAGTAAG ATGGACACCT ACAGAAGTGG CTTTCCAGGCCCTAAAGAAG 1080 GCCCTAACCC AAGCCCCAGT GTTCAGCTTG CCAACAGGGC AAGATTTTTCTTTATATGCC 1140 ACAGAAAAAA CAGGAATAGC TCTAGGAGTC CTTACGCAGG TCTCAGGGATGAGCTTGCAA 1200 CCCGTGGTAT ACCTGAGTAA GGAAATTGAT GTAGTGGCAA AGGGTTGGCCTCATNGTTTA 1260 TGGGTAATGG NGGCAGTAGC AGTCTNAGTA TCTGAAGCAG TTAAAATAATACAGGGAAGA 1320 GATCTTNCTG TGTGGACATC TCATGATGTG AACGGCATAC TCACTGCTAAAGGAGACTTG 1380 TGGTTGTCAG ACAACCATTT ACTTAANTAT CAGGCTCTAT TACTTGAAGAGCCAGTGCTG 1440 NGACTGCGCA CTTGTGCAAC TCTTAAACCC GCCACATTTC 
TTCCAGACAATGAAGAAAAG 1500 ATAGAACATA ACTGTCAACA AGTAATTGCT CAAACCTATG CTGCTCGAGGGGACCTTCTA 1560 GAGGTTCCCT TGACTGATCC CGACCTCAAC TTGTATA 1597 1600 basepairs nucleic acid single linear DNA (genomic) 206 ATTATGCCTG AAAGCCCCACTCCCTTGTTA GGGAGAGACA TTTTAGCAAA AGCAGGGGCC 60 ATTATACACC TGAACATAGGAAAAGGAATA CCCATTTGCT GTCCCCTGCT TGAGGAAGGA 120 ATTAATCCTG AAGTCTGGGCAATAGAAGGA CAATATGGAC AAGCAAAGAA TGCCCGTCCT 180 GTTCAAGTTA AACTAAAGGATTCTGCCTCC TTTCCCTACC AAAGGAAGTA CCCTCTTAGA 240 CCCGAGGCCC TACAAGGANCTCAAAAGATT GTTAAGGACC TAAAAGCCCA AGGCCTAGTA 300 AAACCATGCA GTAGCCCCTGCAATACTCCA ATTTTAGGAG TAAGGAAACC CAACGGACAG 360 TGGAGGTTAG TGCAAGAACTCAGGATTATC AATGAGGCTG TTGTTCCTCT ATACCCAGCT 420 GTACCTAACC CTTATACAGTGCTTTCCCAA ATACCAGAGG AAGCAGAGTG GTTTACAGTC 480 CTGGACCTTA AGGATGCCTTTTTCTGCATC CCTGTACGTC CTGACTCTCA ATTCTTGTTT 540 GCCTTTGAAG ATCCTTTGAACCCAACGTCT CAACTCACCT GGACTGTTTT ACCCCAAGGG 600 TTCAGGGATA GCCCCCATCTATTTGGCCAG GCATTAGCCC AAGACTTGAG YCARTYCTCA 660 TACCTGGACA YTCTTGTYCYTCAGTAYRKG GATGAYTTAM TTWTAGYCRC CCRTTCAGAA 720 ACCTTGTGSC AMCAAGCCACCCAAGHRCTC TTAAMTTTCC TYRCTACCTG TGGCTACAAG 780 GTTTCCAAAC MAARGGCTCRSCTCTGCTCA CASSAGRTTA RATACTNAGG GCTAAAATTA 840 TCCAAAGKCR CCAGGGCCCTCAGWGAGGAA CGTATCCAGC STATACTGGM TTATCCMCAT 900 CCCAWAACCM TAAAGCAACTAAGARGGTTC CTTGGCATAW CAGSYTTCTG CCGAAWAYRG 960 ATTCCCVGRT ACASYSMAATAGCCAGRCCA TTATRTACAY TADYTARGGA AACTCAGAAA 1020 GCCAATACCY ATWTAGTAAGATGGACACCT GARACAGAAG TGGCTTTCCA GGCCCTAAAG 1080 AAGGCCCTAA CCCAAGCCCCAGTGTTCAGC TTGCCAACAG GGCAAGATTT TTCTTTATAT 1140 GCCACAGAAA AAACAGGAATAGCTCTAGGA GTCCTTACGC AGGTCTCAGG GATGAGCTTG 1200 CAACCCGTGG TATACCTGAGTAAGGAAATT GATGTAGTGG CAAAGGGTTG GCCTCATNGT 1260 TTATGGGTAA TGGNGGCAGTAGCAGTCTNA GTATCTGAAG CAGTTAAAAT AATACAGGGA 1320 AGAGATCTTN CTGTGTGGACATCTCATGAT GTGAACGGCA TACTCACTGC TAAAGGAGAC 1380 TTGTGGTTGT CAGACAACCATTTACTTAAN TATCAGGCTC TATTACTTGA AGAGCCAGTG 1440 CTGNGACTGC GCACTTGTGCAACTCTTAAA CCCGCCACAT TTCTTCCAGA CAATGAAGAA 1500 AAGATAGAAC ATAACTGTCAACAAGTAATT GCTCAAACCT ATGCTGCTCG AGGGGACCTT 
1560 CTAGAGGTTC CCTTGACTGATCCCGACCTC AACTTGTATA 1600 1600 base pairs nucleic acid single linearDNA (genomic) 207 ATTATGCCTG AAAGCCCCAC TCCCTTGTTA GGGAGAGACA TTTTAGCAAAAGCAGGGGCC 60 ATTATACACC TGAACATAGG AAAAGGAATA CCCATTTGCT GTCCCCTGCTTGAGGAAGGA 120 ATTAATCCTG AAGTCTGGGC AATAGAAGGA CAATATGGAC AAGCAAAGAATGCCCGTCCT 180 GTTCAAGTTA AACTAAAGGA TTCTGCCTCC TTTCCCTACC AAAGGAAGTACCCTCTTAGA 240 CCCGAGGCCC TACAAGGANC TCAAAAGATT GTTAAGGACC TAAAAGCCCAAGGCCTAGTA 300 AAACCATGCA GTAGCCCCTG CAATACTCCA ATTTTAGGAG TAAGGAAACCCAACGGACAG 360 TGGAGGTTAG TGCAAGAACT CAGGATTATC AATGAGGCTG TTGTTCCTCTATACCCAGCT 420 GTACCTAACC CTTATACAGT GCTTTCCCAA ATACCAGAGG AAGCAGAGTGGTTTACAGTC 480 CTGGACCTTA AGGATGCCTT TTTCTGCATC CCTGTACGTC CTGACTCTCAATTCTTGTTT 540 GCCTTTGAAG ATCCTTTGAA CCCAACGTCT CAACTCACCT GGACTGTTTTACCCCAAGGG 600 TTCAGGGATA GCCCCCATCT ATTTGGCCAG GCATTAGCCC AAGACTTGAGYCARTYYTCA 660 TACCTGGACA YTCTTGTYCY TCRGTACRTG GATGATTTAC TTTTAGYCRCCCRTTCAGAA 720 ACCTTGTGCC ATCAAGCCAC CCAAGMACTC TTAAMTTTCC TYRCTACCTGTGGCTACAAG 780 GTTTCCAAAC CAAAGGCTCR GCTCTGCTCA CAGSAGRTTA RATACTTAGGGCTAAAATTA 840 TCCAAAGGCA CCAGRRCCCT CAGTGAGGAA CGTATCCAGC CTATACTGGSTTATCCTCAT 900 CCCAAAACCC TAAAGCAACT AASAGSGTTC CTTGGCATAA CAGGTTTCTGCCRAAWAYRG 960 ATTCCCAGGT ACASCMMRRT AGCCAGACCA TTAWATACAC KAATTARGGAAACTCARAAA 1020 GCCARTACCY ATTTAGTAAG ATGGACAYCT GAAGCAGAAG TGGCTTTCCAGGCCCTAAAG 1080 AAGGCCCTAA CCCAAGCCCC AGTGTTCAGC TTGCCAACAG GGCAAGATTTTTCTTTATAT 1140 GCCACAGAAA AAACAGGAAT AGCTCTAGGA GTCCTTACGC AGGTCTCAGGGATGAGCTTG 1200 CAACCCGTGG TATACCTGAG TAAGGAAATT GATGTAGTGG CAAAGGGTTGGCCTCATNGT 1260 TTATGGGTAA TGGNGGCAGT AGCAGTCTNA GTATCTGAAG CAGTTAAAATAATACAGGGA 1320 AGAGATCTTN CTGTGTGGAC ATCTCATGAT GTGAACGGCA TACTCACTGCTAAAGGAGAC 1380 TTGTGGTTGT CAGACAACCA TTTACTTAAN TATCAGGCTC TATTACTTGAAGAGCCAGTG 1440 CTGNGACTGC GCACTTGTGC AACTCTTAAA CCCGCCACAT TTCTTCCAGACAATGAAGAA 1500 AAGATAGAAC ATAACTGTCA ACAAGTAATT GCTCAAACCT ATGCTGCTCGAGGGGACCTT 1560 CTAGAGGTTC CCTTGACTGA TCCCGACCTC AACTTGTATA 1600 683amino acids amino acid single 
linear peptide 208 Ile Met Pro Glu Ser ProThr Pro Leu Leu Gly Arg Asp Ile Leu Ala 1 5 10 15 Lys Ala Gly Ala IleIle His Leu Asn Ile Gly Lys Gly Ile Pro Ile 20 25 30 Cys Cys Pro Leu LeuGlu Glu Gly Ile Asn Pro Glu Val Trp Ala Ile 35 40 45 Glu Gly Gln Tyr GlyGln Ala Lys Asn Ala Arg Pro Val Gln Val Lys 50 55 60 Leu Lys Asp Ser AlaSer Phe Pro Tyr Gln Arg Lys Tyr Pro Leu Arg 65 70 75 80 Pro Glu Ala LeuGln Gly Xaa Gln Lys Ile Val Lys Asp Leu Lys Ala 85 90 95 Gln Gly Leu ValLys Pro Cys Ser Ser Pro Cys Asn Thr Pro Ile Leu 100 105 110 Gly Val ArgLys Pro Asn Gly Gln Trp Arg Leu Val Gln Asp Leu Arg 115 120 125 Ile IleAsn Glu Ala Val Phe Pro Leu Tyr Pro Ala Val Ser Ser Pro 130 135 140 TyrThr Leu Leu Ser Leu Ile Pro Glu Glu Ala Glu Trp Phe Thr Val 145 150 155160 Leu Asp Leu Lys Asp Ala Phe Phe Cys Ile Pro Val Arg Pro Asp Ser 165170 175 Gln Phe Leu Phe Ala Phe Glu Asp Pro Leu Asn Pro Thr Ser Gln Leu180 185 190 Thr Trp Thr Val Leu Pro Gln Gly Phe Arg Asp Ser Pro His LeuPhe 195 200 205 Gly Gln Ala Leu Ala Gln Asp Leu Ser Gln Pro Ser Tyr LeuAsp Thr 210 215 220 Leu Val Leu Gln Tyr Val Asp Asp Leu Leu Leu Val AlaArg Ser Glu 225 230 235 240 Thr Leu Cys His Gln Ala Thr Gln Glu Leu LeuIle Phe Leu Thr Thr 245 250 255 Cys Gly Tyr Lys Val Ser Lys Pro Lys AlaArg Leu Cys Ser Gln Glu 260 265 270 Ile Arg Tyr Leu Gly Leu Lys Leu SerLys Gly Thr Arg Ala Leu Ser 275 280 285 Glu Glu Arg Ile Gln Pro Ile LeuAla Tyr Pro His Pro Lys Thr Leu 290 295 300 Lys Gln Leu Arg Gly Phe LeuGly Ile Thr Gly Phe Cys Arg Lys Gln 305 310 315 320 Ile Pro Arg Tyr ThrPro Ile Ala Arg Pro Leu Tyr Thr Leu Ile Arg 325 330 335 Glu Thr Gln LysAla Asn Thr Tyr Leu Val Arg Trp Thr Pro Thr Glu 340 345 350 Val Ala PheGln Ala Leu Lys Lys Ala Leu Thr Gln Ala Pro Val Phe 355 360 365 Ser LeuPro Thr Gly Gln Asp Phe Ser Leu Tyr Ala Thr Glu Lys Thr 370 375 380 GlyIle Ala Leu Gly Val Leu Thr Gln Val Ser Gly Met Ser Leu Gln 385 390 395400 Pro Val Val Tyr Leu Ser Lys Glu Ile Asp Val Val Ala Lys Gly Trp 405410 415 Pro His Cys Leu Trp Val Met 
Ala Ala Val Ala Val Leu Val Ser Glu420 425 430 Ala Val Lys Ile Ile Gln Gly Arg Asp Leu Thr Val Trp Thr SerHis 435 440 445 Asp Val Asn Gly Ile Leu Thr Ala Lys Gly Asp Leu Trp LeuSer Asp 450 455 460 Asn His Leu Leu Asn Tyr Gln Ala Leu Leu Leu Glu GluPro Val Leu 465 470 475 480 Arg Leu Arg Thr Cys Ala Thr Leu Gln Pro AlaThr Phe Leu Pro Asp 485 490 495 Asn Glu Glu Lys Ile Glu His Asn Cys GlnGln Val Ile Ala Gln Thr 500 505 510 Tyr Ala Ala Arg Gly Asp Leu Leu GluVal Pro Leu Thr Asp Pro Asp 515 520 525 Leu Asn Leu Tyr Thr Asp Gly SerSer Leu Ala Glu Lys Gly Leu Arg 530 535 540 Lys Ala Gly Tyr Ala Val IleSer Asp Asn Gly Ile Leu Glu Ser Asn 545 550 555 560 Arg Leu Thr Pro GlyThr Ser Ala His Leu Ala Glu Leu Ile Ala Leu 565 570 575 Thr Trp Ala LeuGlu Leu Gly Glu Gly Lys Arg Val Asn Ile Tyr Ser 580 585 590 Asp Ser LysTyr Ala Tyr Leu Val Leu His Ala His Ala Ala Ile Trp 595 600 605 Arg GluArg Glu Phe Leu Thr Ser Glu Gly Thr Pro Ile Asn His Gln 610 615 620 GluAla Ile Arg Arg Leu Leu Leu Ala Val Gln Lys Pro Lys Glu Val 625 630 635640 Ala Val Leu His Cys Gln Gly His Gln Glu Glu Glu Glu Arg Glu Ile 645650 655 Glu Gly Asn Arg Gln Ala Asp Ile Glu Ala Lys Lys Ala Ala Arg Gln660 665 670 Asp Ser Pro Leu Glu Met Leu Ile Glu Gly Pro 675 680 146amino acids amino acid single linear peptide 209 Asp Leu Ser Gln Ser SerTyr Leu Asp Ile Leu Val Leu Arg Tyr Met 1 5 10 15 Asp Asp Leu Leu LeuAla Thr His Ser Glu Thr Leu Cys His Gln Ala 20 25 30 Thr Gln Ala Leu LeuAsn Phe Leu Ala Thr Cys Gly Tyr Lys Val Ser 35 40 45 Lys Pro Lys Ala GlnLeu Cys Ser Gln Gln Val Lys Tyr Leu Gly Leu 50 55 60 Lys Leu Ser Lys GlyThr Arg Ile Leu Ser Glu Glu Arg Ile Gln Pro 65 70 75 80 Ile Leu Gly TyrPro His Pro Lys Thr Leu Lys Gln Leu Thr Ala Phe 85 90 95 Leu Gly Ile ThrGly Phe Cys Gln Ile Trp Ile Pro Arg Tyr Ser Lys 100 105 110 Ile Ala ArgPro Leu Asn Thr Arg Ile Lys Glu Thr Gln Lys Ala Asn 115 120 125 Thr HisLeu Val Arg Trp Thr Pro Glu Ala Glu Val Ala Phe Gln Ala 130 135 140 LeuLys 145 683 amino acids amino acid single linear 
peptide 210 Ile Met ProGlu Ser Pro Thr Pro Leu Leu Gly Arg Asp Ile Leu Ala 1 5 10 15 Lys AlaGly Ala Ile Ile His Leu Asn Ile Gly Lys Gly Ile Pro Ile 20 25 30 Cys CysPro Leu Leu Glu Glu Gly Ile Asn Pro Glu Val Trp Ala Ile 35 40 45 Glu GlyGln Tyr Gly Gln Ala Lys Asn Ala Arg Pro Val Gln Val Lys 50 55 60 Leu LysAsp Ser Ala Ser Phe Pro Tyr Gln Arg Lys Tyr Pro Leu Arg 65 70 75 80 ProGlu Ala Leu Gln Gly Xaa Gln Lys Ile Val Lys Asp Leu Lys Ala 85 90 95 GlnGly Leu Val Lys Pro Cys Ser Ser Pro Cys Asn Thr Pro Ile Leu 100 105 110Gly Val Arg Lys Pro Asn Gly Gln Trp Arg Leu Val Gln Asp Leu Arg 115 120125 Ile Ile Asn Glu Ala Val Phe Pro Leu Tyr Pro Ala Val Ser Ser Pro 130135 140 Tyr Thr Leu Leu Ser Leu Ile Pro Glu Glu Ala Glu Trp Phe Thr Val145 150 155 160 Leu Asp Leu Lys Asp Ala Phe Phe Cys Ile Pro Val Arg ProAsp Ser 165 170 175 Gln Phe Leu Phe Ala Phe Glu Asp Pro Leu Asn Pro ThrSer Gln Leu 180 185 190 Thr Trp Thr Val Leu Pro Gln Gly Phe Arg Asp SerPro His Leu Phe 195 200 205 Gly Gln Ala Leu Ala Gln Asp Leu Ser Gln ProSer Tyr Leu Asp Thr 210 215 220 Leu Val Leu Gln Tyr Val Asp Asp Leu LeuLeu Val Ala Arg Ser Glu 225 230 235 240 Thr Leu Cys His Gln Ala Thr GlnGlu Leu Leu Ile Phe Leu Thr Thr 245 250 255 Cys Gly Tyr Lys Val Ser LysPro Lys Ala Arg Leu Cys Ser Gln Glu 260 265 270 Ile Arg Tyr Leu Gly LeuLys Leu Ser Lys Gly Thr Arg Ala Leu Ser 275 280 285 Glu Glu Arg Ile GlnPro Ile Leu Ala Tyr Pro His Pro Lys Thr Leu 290 295 300 Lys Gln Leu ArgGly Phe Leu Gly Ile Thr Gly Phe Cys Arg Lys Gln 305 310 315 320 Ile ProArg Tyr Thr Pro Ile Ala Arg Pro Leu Tyr Thr Leu Ile Arg 325 330 335 GluThr Gln Lys Ala Asn Thr Tyr Leu Val Arg Trp Thr Pro Thr Glu 340 345 350Val Ala Phe Gln Ala Leu Lys Lys Ala Leu Thr Gln Ala Pro Val Phe 355 360365 Ser Leu Pro Thr Gly Gln Asp Phe Ser Leu Tyr Ala Thr Glu Lys Thr 370375 380 Gly Ile Ala Leu Gly Val Leu Thr Gln Val Ser Gly Met Ser Leu Gln385 390 395 400 Pro Val Val Tyr Leu Ser Lys Glu Ile Asp Val Val Ala LysGly Trp 405 410 415 Pro His Cys Leu Trp Val Met Ala Ala 
Val Ala Val LeuVal Ser Glu 420 425 430 Ala Val Lys Ile Ile Gln Gly Arg Asp Leu Thr ValTrp Thr Ser His 435 440 445 Asp Val Asn Gly Ile Leu Thr Ala Lys Gly AspLeu Trp Leu Ser Asp 450 455 460 Asn His Leu Leu Asn Tyr Gln Ala Leu LeuLeu Glu Glu Pro Val Leu 465 470 475 480 Arg Leu Arg Thr Cys Ala Thr LeuGln Pro Ala Thr Phe Leu Pro Asp 485 490 495 Asn Glu Glu Lys Ile Glu HisAsn Cys Gln Gln Val Ile Ala Gln Thr 500 505 510 Tyr Ala Ala Arg Gly AspLeu Leu Glu Val Pro Leu Thr Asp Pro Asp 515 520 525 Leu Asn Leu Tyr ThrAsp Gly Ser Ser Leu Ala Glu Lys Gly Leu Arg 530 535 540 Lys Ala Gly TyrAla Val Ile Ser Asp Asn Gly Ile Leu Glu Ser Asn 545 550 555 560 Arg LeuThr Pro Gly Thr Ser Ala His Leu Ala Glu Leu Ile Ala Leu 565 570 575 ThrTrp Ala Leu Glu Leu Gly Glu Gly Lys Arg Val Asn Ile Tyr Ser 580 585 590Asp Ser Lys Tyr Ala Tyr Leu Val Leu His Ala His Ala Ala Ile Trp 595 600605 Arg Glu Arg Glu Phe Leu Thr Ser Glu Gly Thr Pro Ile Asn His Gln 610615 620 Glu Ala Ile Arg Arg Leu Leu Leu Ala Val Gln Lys Pro Lys Glu Val625 630 635 640 Ala Val Leu His Cys Gln Gly His Gln Glu Glu Glu Glu ArgGlu Ile 645 650 655 Glu Gly Asn Arg Gln Ala Asp Ile Glu Ala Lys Lys AlaAla Arg Gln 660 665 670 Asp Ser Pro Leu Glu Met Leu Ile Glu Gly Pro 675680

What is claimed is:
 1. An isolated nucleic acid comprising the pol gene of a retrovirus associated with multiple sclerosis or rheumatoid arthritis.
 2. The nucleic acid of claim 1, wherein said pol gene encodes a reverse transcriptase comprising an enzymatic site between amino acid domains LPQG and YXDD, wherein said virus has a phylogenetic distance from HSERV-9 of 0.063±0.1.
 3. A nucleotide fragment comprising: (i) a coding nucleotide sequence selected from the group consisting of SEQ ID NO: 87, its complementary sequence, SEQ ID NO: 88, its complementary sequence, and sequences encoding the peptide sequence defined by SEQ ID NO: 89, or (ii) a portion of said coding nucleotide sequence, which encodes a peptide that is recognized by sera of patients infected with the MSRV-1 virus.
 4. A process for detecting, in a biological sample, a virus associated with multiple sclerosis or rheumatoid arthritis, comprising: contacting the nucleotide fragment of claim 3 with said biological sample, and determining whether the nucleotide fragment hybridizes with a nucleic acid sequence in said biological sample, wherein hybridization indicates the presence of said virus.
 5. A nucleic acid probe for the detection of a virus associated with multiple sclerosis or rheumatoid arthritis, wherein said probe specifically hybridizes with the nucleotide fragment of claim 3.
 6. The probe of claim 5, consisting of between 10 and 1,000 monomers.
 7. A primer for the amplification by polymerization of a nucleic acid of a viral material associated with multiple sclerosis or rheumatoid arthritis, comprising a nucleotide sequence having, for any succession of at least 20 contiguous monomers, at least 70% homology with the nucleotide sequence of the fragment of claim 3.
 8. An isolated or purified polypeptide encoded by an open reading frame of the nucleotide sequence of the fragment of claim 3.
 9. The polypeptide of claim 8, wherein said open reading frame comprises, in the 5′ to 3′ direction, the sequence between nucleotide 18 and nucleotide 2301 of SEQ ID NO: 87.
 10. The polypeptide of claim 8, wherein said open reading frame is selected from the group consisting of a first open reading frame beginning at nucleotide 18 and ending at nucleotide 340 of SEQ ID NO: 87, a second open reading frame beginning at nucleotide 341 and ending at nucleotide 2304 of SEQ ID NO: 87 and a third open reading frame beginning at nucleotide 1858 and ending at nucleotide 2304 of SEQ ID NO: 87.
 11. An isolated or purified polypeptide selected from the group consisting of a polypeptide comprising peptide sequence SEQ ID NO: 90, a polypeptide consisting of peptide sequence SEQ ID NO: 90, a polypeptide encoded by an open reading frame beginning at nucleotide 18 and ending at nucleotide 340 of SEQ ID NO: 87, and an equivalent polypeptide thereof which exhibits the proteolytic activity of a polypeptide of SEQ ID NO: 90.
 12. An isolated or purified polypeptide selected from the group consisting of a polypeptide comprising peptide sequence SEQ ID NO: 91, a polypeptide consisting of peptide sequence SEQ ID NO: 91, a polypeptide encoded by an open reading frame beginning at nucleotide 341 and ending at nucleotide 2304 of SEQ ID NO: 87, and an equivalent polypeptide thereof which exhibits the reverse transcriptase activity of a polypeptide of SEQ ID NO: 91.
 13. An isolated or purified polypeptide selected from the group consisting of a polypeptide comprising peptide sequence SEQ ID NO: 92, a polypeptide consisting of peptide sequence SEQ ID NO: 92, a polypeptide encoded by an open reading frame beginning at nucleotide 1858 and ending at nucleotide 2304 of SEQ ID NO: 87, and an equivalent polypeptide which exhibits the ribonuclease activity of a polypeptide of SEQ ID NO: 92.
 14. Polypeptide of SEQ ID NO: 89.
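
The windowed criterion recited for the primer (at least 70% homology over any succession of at least 20 contiguous monomers) may be illustrated computationally. The following is a minimal sketch only, written under stated assumptions: "homology" is read here as ungapped percent identity, each 20-monomer window of the candidate is compared against its best-matching stretch of the reference fragment, and degenerate IUPAC codes are treated as ordinary characters. The function names `window_identity` and `meets_window_criterion` and the toy sequences are hypothetical; they are not taken from the sequence listing and do not define the scope of any claim.

```python
def window_identity(window: str, target: str) -> float:
    """Best ungapped identity of `window` against any same-length stretch of `target`.

    Assumption: identity = matching positions / window length, with no gaps.
    """
    w = len(window)
    if w == 0 or len(target) < w:
        return 0.0
    best = 0.0
    for start in range(len(target) - w + 1):
        stretch = target[start:start + w]
        matches = sum(a == b for a, b in zip(window, stretch))
        best = max(best, matches / w)
    return best


def meets_window_criterion(candidate: str, target: str,
                           window: int = 20, threshold: float = 0.70) -> bool:
    """True if every `window`-long stretch of `candidate` reaches `threshold`
    identity with some stretch of `target` (hypothetical reading of the criterion)."""
    candidate = candidate.upper()
    target = target.upper()
    if len(candidate) < window:
        return False  # no succession of `window` contiguous monomers exists
    return all(
        window_identity(candidate[i:i + window], target) >= threshold
        for i in range(len(candidate) - window + 1)
    )


if __name__ == "__main__":
    # Toy reference and candidates, invented for illustration only.
    reference = "ACGT" * 10
    exact_sub = reference[5:30]            # exact substring of the reference
    near_match = "TTT" + reference[3:20]   # 20-mer with three substitutions
    unrelated = "A" * 25
    for name, seq in [("exact", exact_sub), ("near", near_match), ("unrelated", unrelated)]:
        print(name, meets_window_criterion(seq, reference))
```

A stricter or looser reading (for example, gapped alignment, or matching of degenerate bases such as R or Y against the bases they represent) would require a different scoring function; the sketch fixes one simple interpretation so that the threshold arithmetic is visible.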